
Fingerprinting Internet Paths using Packet Pair Dispersion

USC Computer Science Technical Report No. 06-876

Rishi Sinha, Christos Papadopoulos, John Heidemann
Department of Computer Science, University of Southern California

ABSTRACT

Path fingerprinting is an essential component of applications that distinguish among different network paths, including path selection in overlay networks, multi-path routing, monitoring and diagnosis of network problems, and developing a deeper understanding of network behavior. This paper proposes a new approach to Internet path fingerprinting based on the distribution of end-to-end packet-pair measurements. This approach allows detection of busy link sharing between two paths, even when those segments have low utilization and are not the paths' bottlenecks. While our fingerprints do not assure physically disjoint paths (since that requires information external to the network), they reflect the traffic and link characteristics of intermediate links. This methodology is therefore tolerant of opaque clouds such as VPNs, VLANs, or MPLS (unlike traceroute). Using analysis and simulation we explore the network factors that affect the fingerprints, and we introduce a simple method to compare them. Through measurements of up to a year over 15 Internet paths, we show that our fingerprints are both distinct and persistent over periods of several months, making their collection and use for path selection feasible.

1. INTRODUCTION

Path characterization is a process that measures the performance of a path, typically capacity, loss, and delay. The goal of path characterization is to describe something about the path itself. Path fingerprinting is a process that generates a unique fingerprint for a path, typically aimed at distinguishing one path from another. The fingerprint may or may not be dependent on performance-related properties of the path. Both path characterization and fingerprinting are important to a wide range of applications such as overlay path selection, multi-path routing, monitoring and diagnosis of network problems, designing and testing protocols for realistic network conditions, and developing a deeper understanding of network behavior.

Path fingerprinting is especially useful for applications that require the selection of disjoint paths to improve performance or robustness. Path fingerprints allow such applications to determine if two paths share common links. A simple path fingerprint is the identity of all routers on the path. While tools such as traceroute aim to provide this information, these tools fail when the router topology becomes opaque due to layer-2 clouds (from ATM or optical switching), tunneling (such as MPLS or VPNs), or non-cooperative routers.

While an exact router-level topology is difficult to determine, users optimizing network performance are often interested in performance isolation. Here the goal is to identify paths that do not share highly utilized links. For such applications, the existence of shared bottlenecks must be identified, but over-provisioned links are of much less concern. Ideally, path fingerprints should also be relatively stable, allowing data collection costs to be amortized by reusing and sharing fingerprints.

In this paper we develop a new technique to fingerprint Internet paths based on the distribution of packet pair dispersions, which we call the dispersion fingerprint of a path (Section 2). Our approach has the following important properties: it generates distinct path fingerprints that can be reliably compared using simple techniques; fingerprints taken at the same time of day are persistent over periods of at least several months, and are thus reusable; and the fingerprints are shaped by traffic and link characteristics, especially links with higher utilization, and can be used to detect path overlap.

To understand how dispersion fingerprints are shaped by the network we first systematically study fingerprints through network simulation (Section 3), and compare these results with fingerprints from over 15 paths on Internet2 for periods of up to a year (Section 4). We corroborate these results with short-term measurements from commercial Internet paths (Section 5). Finally, in Section 6 we propose two applications of fingerprints, namely detecting sharing of busy links between two paths and detecting traffic and link changes over time on a path, and we discuss the advantages of our approach over existing solutions.

To our knowledge, the existence of distinct and persistent path fingerprints that capture link and traffic characteristics has not been demonstrated before. While there has been a great deal of work on path characterization [3,5–7,12,26,35], most has focused on determining specific network characteristics such as loss, delay, and throughput rather than fingerprinting. The packet pair methodology has been widely used before in link capacity and available bandwidth estimation [9,10,13,18,20,24,30]. Our work differs because we aim to find an indicator of a path's unique identity rather than characterizing performance aspects.

John Heidemann and Christos Papadopoulos are partially supported by the United States Department of Homeland Security contract number NBCHC040137 ("LANDER"). Rishi Sinha is supported by the Integrated Media Systems Center, a National Science Foundation Engineering Research Center, Cooperative Agreement No. EEC-9529152. Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect those of the Department of Homeland Security or the National Science Foundation.

[Figure 1, panel (a): No change, $\delta_{post} = \delta_{pre}$. Legend: probe traffic (packets P1, P2), cross traffic.]
[Figure 1, panel (b): Link expansion, $\delta_{post} > \delta_{pre}$. Panel (c): Traffic expansion, $\delta_{post} > \delta_{pre}$. Panel (d): Traffic compression, $\delta_{post} < \delta_{pre}$.]
Figure 1: Examples of how packet pair dispersion may change.

Our contributions are as follows: (a) we define dispersion fingerprints as a new approach to identifying paths shaped by traffic and link characteristics; (b) we investigate the underlying physical basis for dispersion fingerprints and demonstrate through simulation, analysis and measurement how links and cross traffic affect dispersion; and (c) we demonstrate that many Internet paths contain a distinct and persistent dispersion fingerprint. In addition, we begin to explore how our fingerprints can be used to identify paths with shared links and detect traffic changes on a path.

2. THE DISPERSION DISTRIBUTION AS A PATH FINGERPRINT

In this section we present evidence that the distribution of packet pair dispersions is a good candidate for fingerprinting Internet paths.

2.1 Packet Pair Basics

Packet pairs [18] and their variants (such as packet trains) are a useful building block commonly used in tools for link capacity and available bandwidth estimation [7,9,10,13,20,24,26,30]. A packet pair typically consists of two equal-sized packets sent with fixed initial dispersion $\delta_{init}$. We use the term dispersion as it is commonly defined: the time interval between the first bit of the first packet and the first bit of the second packet of the pair. At the destination, the resulting dispersion $\delta_{final}$ is a function of both the physical capacity of the various links in the path and the competing cross traffic on these links.

Figure 1 shows how network links and traffic affect dispersion. Observe the incoming and outgoing dispersions ($\delta_{pre}$ and $\delta_{post}$) at the first router in each case. With no interference and equal capacity links, case (a) shows that there is no change in dispersion. In case (b), the packet pair experiences expansion (the dispersion increases) as it moves from a high-bandwidth to a low-bandwidth link. In case (c), the dispersion value again increases, but this time due to cross-traffic packets slipping between the packet pair, increasing their separation. Finally, in case (d), the packet pair experiences compression due to queuing at a busy link. The final dispersion $\delta_{final}$ at the end of the path is due to combinations of these effects along the entire path. We reevaluate these cases more precisely in Section 3.2.

Several other network factors can also affect dispersion values from packet pair measurements. Scheduling policies, multi-path routing and route changes could all affect measurements. Like most other prior work using packet pairs, we assume the common case of FIFO or RED queuing, single-path routing and stable routes between probes.

In this paper we explore the use of packet pairs to characterize network paths. From these examples we observe that the network can either increase or decrease the dispersion value unpredictably, making it an apparently difficult choice. We next explore how the network forces regularities in dispersion values, allowing a modest number of probes to characterize a link.

2.2 Can Dispersion Values Provide a Path Fingerprint?

The discussion above suggests that packet pair dispersions reflect features of both the static (physical) and the dynamic (cross-traffic) properties of a path. However, fingerprints are most useful to the extent that they are distinct and persistent over at least moderate periods of time. We look into these questions next.

Figure 2(a) shows the results of an experiment where a sustained stream of UDP packet pairs was transmitted for a period of one hour between hosts in Germany and Massachusetts. We call this path DE–UM. The initial dispersion of each pair was 120 µs, which corresponds to back-to-back transmission of the two 1500 byte packets at 100 Mb/s. (Full details about our methodology are in Section 3.1.) The figure shows final dispersion $\delta_{final}$ on the y-axis with time on the x-axis. This scatterplot shows strong horizontal bands, indicating frequent dispersion modes around 135, 160, 200, 240, 280, 320, 360 µs, etc.

Although the figure shows strong trends in the data, it is not obvious how to interpret it. We would like to quantify the strength of given bands and understand their causes, so we switch to a cumulative distribution of the dataset in Figure 2(b). This representation makes quantification of the dispersions easier, since the strength of each band is proportional to the size of the step in the CDF. Figure 2(b) also adds the CDF of a similar experiment conducted about a month later. The strong similarity of the CDFs in Figure 2(b) suggests that the bands in Figure 2(a) are not transient but may be caused by the underlying network, indicating that dispersion CDFs may provide persistent fingerprints.

[Figure 2, panel (a): Germany to Massachusetts (path DE-UM), Aug. 2005. Panel (b): Germany to Massachusetts (path DE-UM); CDF versus dispersion (microseconds); curves: Aug 2005, Sep 2005. Panel (c): California to Korea (path SC-KR); CDF versus dispersion (microseconds); curves: Nov 2004, Jan 2005 pre-change, Jan 2005 post-change.]
Figure 2: Measurements motivating the use of dispersion distributions as path signatures.

Turning to the question of distinctness, Figure 2(c) shows dispersion CDFs from three measurements taken with the same methodology on a different path, California to Korea (SC–KR). We observe that the DE–UM CDFs were very similar to each other, while the SC–KR measurements appear quite different. This observation suggests that dispersion CDFs may be distinct for a given path. In Section 4 we measure 15 different Internet paths over periods of up to a year, and we find each one to have a distinctly different CDF.

While the DE–UM CDFs (Figure 2(b)) are very consistent, we see two different trends in SC–KR (Figure 2(c)). In January 2005, there was a significant and permanent change in the CDF of this path: before the change we see two modes (at 120 and 240 µs), but later only the first mode remains. We associate this change with a network outage and temporary route change, suggesting that the routing facilities were altered at this time. We explore this example in detail in Section 4.5. Here we observe that large changes in dispersion CDFs can be associated with changes in the underlying network; dispersion CDFs are not completely unchanging.

In this section we showed preliminary evidence that dispersion CDFs of Internet paths can persist for months, and can be distinct, suggesting that dispersion CDFs may provide a new type of path fingerprint. In the following sections, we investigate in detail the factors affecting our fingerprint and its properties and examine dispersion CDFs through simulation and a larger set of Internet measurements.

3. SYSTEMATIC STUDY OF THE DISPERSION CDF

We next explore factors that influence dispersion CDFs, beginning with simple one-router scenarios and case-by-case analysis of cross-traffic. We then generalize this discussion of single-hop dispersion, dividing the set of conditions into three regions of operation. Finally we turn to simulations of small scenarios with multiple routers.

Dispersion at a single hop has been studied before [13,21,25] when considering bandwidth estimation. Our goal here is to understand the factors most pertinent to dispersion CDFs used as path fingerprints.

3.1 Methodology for Computing the Dispersion CDF

Our simulations and experiments use the following methodology. We send UDP packet pairs with exponential interdeparture times (exploiting the PASTA principle [33]) at a mean rate of 100 pairs/s. Experiment lengths are either 100 s (simulation) or 1 hour (Internet). (We explore the effect of packet pair rate and measurement duration in detail in Section 5, and get similar results with rates as low as 10 pairs/s and durations as short as 30 s.) Each packet is 1500 bytes long (including UDP and IP headers). We send the two packets back-to-back at the rate of the access link (typically 100 Mb/s), so $\delta_{init}$ is 120 µs. To factor out link speed we normalize the dispersion at the receiver by the initial dispersion and report $\tilde{\delta} = \delta_{final}/\delta_{init}$, the normalized dispersion.

3.2 Analysis of Dispersion After A Single Router

We begin with simple one-hop scenarios with dispersion from packet pairs interacting with cross traffic.

We use a trivial topology: one router with two incoming links and one outgoing link (Figure 3(a)). One incoming link carries the packet pairs, the other carries cross traffic, and all packets exit the router through the outgoing link. Both incoming links have the same capacity $C_{pre}$, while the outgoing link may have the same or a different speed $C_{post}$. We assume that a packet pair arrives at the router with arbitrary dispersion $\delta_{pre}$ and leaves with dispersion $\delta_{post}$. We assume packet pairs and cross traffic packets have the same length, but write them $L_P$ and $L_X$ to allow this assumption to be relaxed later. Section 3.4.3 deals with the case of multiple packet sizes in cross traffic.

To describe how large a particular value of dispersion is, it is helpful to define dispersion slack, $S$, the duration the link would be idle between the two probes if no cross traffic were present (see Figure 3(a)). For any incoming pair, the input slack is $S_{pre} = \delta_{pre} - L_P/C_{pre}$ and the output slack is $S_{post} = \delta_{post} - L_P/C_{post}$, where $\delta_{post}$ is the output dispersion that would result if no cross traffic contended for the output link. Note that $S_{post} = \max(0, \delta_{pre} - L_P/C_{post})$.

We want to consider each way cross traffic may alter probe dispersion. To do this, we consider when a potential cross-traffic packet arrives, and what the spacing of the packet pair is. We assume non-preemptive queueing, so we can discard all cases when cross traffic arrives after the head of the second probe packet.

[Figure 3, panel (a): Dispersion slack, showing probe traffic (P1, P2), cross traffic, slack on input $S_{pre}$ and slack on output $S_{post}$. Panel (b): Dispersion CDFs for each region of operation (point-constrained, range-constrained, unconstrained); CDF versus normalized dispersion.]
Figure 3: Dispersion slack is the key to the regions of operation.

3.2.1 Cross Traffic Arriving Between the Probes

First, consider when a cross-traffic packet arrives after the head of the first probe packet but before the head of the second. In this case, it will interfere, delaying the second packet and increasing dispersion, but the increase is dependent on the output slack $S_{post}$.

When there is no slack ($S_{post} = 0$), any intervening packet will increase dispersion by exactly the service time of that new packet. In other words, $\delta_{post} = \delta_{pre} + L_X/C_{post}$ (if interference), or $\delta_{post} = \delta_{pre}$ (if not). We call this region of operation point-constrained, because dispersion increases by a fixed amount.

Next, suppose there is a bit of slack, but less than a full packet's worth, that is, $0 < S_{post} < L_X/C_{post}$. Again, if no cross traffic arrives dispersion is unchanged, but the longer inter-packet gap makes this chance lower than when point-constrained. If cross traffic arrives immediately before the second probe packet, it will increase the dispersion by $L_X/C_{post}$ (the same as point-constrained). However, if the packet enters a bit earlier, then part will be transmitted before the second probe packet enters, and the increase in dispersion will be slightly smaller: $L_X/C_{post} - \epsilon$. If the intervening packet enters immediately after the first probe packet, it can consume all the slack. Thus, an intervening packet may increase dispersion by any value between $L_X/C_{post} - S_{post}$ and $L_X/C_{post}$. We therefore call this region of operation range-constrained.

Finally, if there is more than a packet's worth of slack ($S_{post} \ge L_X/C_{post}$), then the dispersion can increase by any amount between 0 and $L_X/C_{post}$, since the slack can now absorb an entire cross-traffic packet. We call this region of operation unconstrained.

Figure 3(b) shows dispersion CDFs for these cases, showing when $\delta_{pre}$ is either $L_P/C_{post}$, $1.5\,L_P/C_{post}$ or $2\,L_P/C_{post}$.

3.2.2 Cross Traffic Arriving Before the First Probe

Next we consider when cross traffic arrives $\beta$ before the head of the first probe. If it arrives well before the first probe, when $\beta \ge L_X/C_{post}$, then it can be completely serviced and does not affect dispersion. Otherwise the effects again depend on slack.

If $S_{post} = 0$, then both probes are delayed by $\beta$, but dispersion does not change because the probes remain back-to-back.

If $0 < S_{post} < \beta - L_X/C_{post}$, then the cross traffic delays the first probe's service time until after the second probe arrives, decreasing $\delta_{post}$ to zero (point-constrained).

If $S_{post} \ge \beta - L_X/C_{post}$, the cross traffic again delays the first probe, decreasing dispersion, but the pairs do not leave back-to-back (range-constrained).

If $\beta = 0$ and $S_{post} \ge L_X/C_{post}$, then the dispersion is reduced by one full packet service time; $\delta_{post} = \delta_{pre} - L_X/C_{post}$, and dispersion is point-constrained.

In these examples, the reduction in dispersion occurs because packets queue behind cross traffic. This is the same cause as ACK compression [34], but in our case with data packets.

3.2.3 Faster Output Link

Next consider different link speeds, $C_{pre} \ne C_{post}$. First consider the case when the output link is faster than the input link, $C_{post} > C_{pre}$. Assume the packets arrive back-to-back on the input link, with no idle time between them, so $\delta_{pre} = L_P/C_{pre}$ and $S_{pre} = 0$. This still implies non-zero output slack. Specifically, the slack is $S_{post} = L_P/C_{pre} - L_P/C_{post}$, a value greater than zero, since $C_{post} > C_{pre}$. This slack on the output link creates the potential for the faster link to silently "absorb" some cross traffic, provided the cross traffic can slip into the output link's slack. We will explore this issue in more detail when we consider simulation results in the next section.

3.2.4 Multiple or Faster Input Links

With multiple sources of cross traffic or a faster single source of cross traffic, dispersion values can increase by integral multiples of each of the above cases. (We assume round-robin servicing of all input links.) So if the probes arrive with no slack, then they may exit with $\delta_{post} = L_P/C_{post} + n(L_X/C_{post})$ for any integer $n > 0$ in the point-constrained cases. If there is more slack when the probes enter, they will be constrained into range "stripes" with the maximum of each range at some multiple $n(L_X/C_{post})$, and the minimum at some multiple $n(L_X/C_{post} - S_{post})$. (Note that the range is wider at higher multiples.)

A faster single source of cross traffic causes changes similar to multiple links. In the next section we will simulate cross traffic entering at link speeds ten times that of the probe traffic.

3.2.5 Multiple Hops

Cross traffic over multiple hops causes an increase or decrease in dispersion at every hop. Moreover, an increase in dispersion at one hop increases the slack available at the next, so interactions quickly become complicated. We consider them in simulation in the next section.

3.3 Generalized Regions of Operation on a Single Hop

In Section 3.2 we saw specific examples of how packet interactions give rise to three regions of operation at a single hop. Now we generalize the conditions underlying each region for arbitrary combinations of the packet arrival and link speed scenarios separated above. There are no assumptions other than constant cross traffic packet size. Let $n$ be the number of intervening packets of cross traffic, which may be zero. Each packet pair falls into one of the three regions (Figure 3(b)) depending on its value of slack $S_{post}$, which is determined by $\delta_{pre}$. When $\delta_{pre} \le L_P/C_{post}$, the slack $S_{post}$ is zero, and this is the point-constrained region, where each value of $n$ results in exactly one value of $\delta_{post}$. When $L_P/C_{post} < \delta_{pre} < (L_X + L_P)/C_{post}$, the slack $S_{post}$ can absorb only part of a cross traffic packet, and this is the range-constrained region, where each value of $n$ gives rise to non-overlapping "stripes" of possible $\delta_{post}$ values. When $\delta_{pre} \ge (L_X + L_P)/C_{post}$, the slack $S_{post}$ can absorb one or more cross traffic packets, and the "stripes" now overlap, making any $\delta_{post}$ value above $L_P/C_{post}$ possible.

3.4 Simulation of Dispersion With Multiple Routers

To explore more complex scenarios, we now use simulation and the six topologies shown in Figure 4. We send packet pairs from source S to the destination D through routers R1 and, when present, R2. All links except the access link (S–R1) carry cross traffic. We use either slow links (100 Mb/s, thin lines), or fast links (1 Gb/s, bold lines). We label the topologies using S (for Slow) and F (for Fast) based on the speeds of the busy links, so SF indicates two busy links, a slow link followed by a fast link. (Links feeding cross traffic are always fast links to allow single or multiple packet arrivals between probes.)

[Figure 4, panels: Case S (S, R1, D), Case F (S, R1, D), Case SS, Case SF, Case FF, Case FS (each S, R1, R2, D); in the two-link cases the R1–R2 link is marked "50% util."]
Figure 4: The six possible topologies involving one or two busy links and two link capacities. Thick lines represent fast (F) links, while thin lines represent slow (S) links.

In each topology we vary cross traffic such that utilization at the last link is 30%, 50% or 90%. (Values from 10% to 90% were similar.) With two busy links (SS, SF, FF and FS), we fix cross traffic on R1–R2 at 50% and vary utilization on the R2–D link. Initially, we assume cross traffic is only 1500 byte packets with Poisson arrivals; in Section 3.4.3 we consider multiple packet sizes. Cross traffic is sent using exponential interarrival times. An open issue is to explore non-Poisson distributions of cross traffic.

Probe traffic is injected on the 100 Mb/s link S–R1, which has the same speed as slow intermediate links but one-tenth the speed of fast intermediate links. The probes are sent back-to-back ($S_{pre} = 0$) as described in Section 3.1.

We use the ns-2 simulator [8]. Each simulation run lasts 100 s, but we discard the first 5 s to avoid start-up transients. (The simulation is stable after 10 s, so the duration is ample.) We repeated each simulation 5 times with different randomization but found no significant variation, so we present the results of a single instance here.

[Figure 5, panels: (a) Case S, (b) Case F, (c) Case SS, (d) Case SF, (e) Case FF, (f) Case FS; CDF versus normalized dispersion (0 to 5); curves for R1-D utilization of 30%, 50% and 90% in the single-link cases and R2-D utilization of 30% and 90% in the two-link cases.]
Figure 5: Dispersion CDFs from the six simulation cases.

3.4.1 The Dispersion CDF with One Busy Link

We begin with the simplest case: one busy link that is either the same speed as or faster than the access link of the sender. We will find that these two cases are in the point-constrained and unconstrained regions of operation, respectively. Range-constrained operation is an intermediate region with characteristics of the other two regions, and is not illustrated in these simulations.

Case S. Figure 5(a) shows the dispersion CDFs with homogeneous link speeds. In this topology, all pairs arrive at the busy link with a dispersion $L_P/C$ (where $C$ is the capacity of R1–D) and have zero slack. This scenario is point-constrained, so we see steps in the CDF only at integer multiples of $L_X/C$. Considering the case of light load (30% utilization), we see that about 75% of pairs show no dispersion change: they enter and leave the congested link without change in their spacing. About 20% of pairs show a dispersion of $2\delta_{init} = L_P/C + L_X/C$, corresponding to one cross-traffic packet arriving at R1 after the first probe of the pair but before the second. Finally, we occasionally (about 5% of the time) see dispersion factors higher than 2, corresponding to capture of multiple packets of cross traffic. Since the packet size is fixed, we always see CDF steps at multiples of the packet size, and we characterize this kind of CDF as stairstep.

With 90% load, again we see a stairstep dispersion CDF, but with more cross traffic, large dispersions are more common since multiple packet insertions between probes are more common.

These simulations confirm the analysis of Sections 3.2 and 3.2.4.

With Poisson cross traffic we can compute the expected CDF. The fraction with no dispersion change corresponds to the probability that no cross traffic arrives in duration $L_P/C$. The expected number of arrivals in this time is $(L_P/C) \times (\lambda_X/L_X)$, where $\lambda_X$ is the average data rate of cross traffic. The observed CDFs in case S are in agreement with this distribution.

Case F. Next we turn to case F, where a fast link carries cross traffic. Since the capacity $C$ of the busy link R1–D is 10 times that of the access link, the incoming dispersion at R1 is $10(L_P/C) = 5(L_P + L_X)/C$, which puts all pairs in this case in the unconstrained region. Figure 5(b) shows its CDF, which we describe as smooth. Rather than the large, discrete steps of the stairstep CDF we obtained in the point-constrained case, here we see a spread of dispersion values, centered at $\delta_{init}$. Like case S, dispersions greater than $\delta_{init}$ correspond to cross traffic arriving between two probes and spreading them apart. Unlike case S, the amount of increase is unconstrained, and thus the resulting CDF shows weight at continuous values rather than discrete steps. Another difference from case S is that we see output dispersions $\delta_{final}$ that are smaller than the input dispersion $\delta_{init}$, since, as we argued analytically in Section 3.2.2, pairs can be pushed together if they must queue behind cross traffic. However, the lower bound on $\delta_{final}$ is $L_P/C$.

Comparing the amounts of cross traffic for case F, we see that when there is more cross traffic, we get higher variation in the dispersion (compare the 90% to 30% utilizations). Again, this result is due to a higher probability of probes capturing multiple cross packets. At 90% load, there are steps at multiples of $L_X/C$ (which is $0.1\,\delta_{init}$), because both probes in a pair are often queued along with intervening cross traffic, resulting in zero output slack for many pairs.

3.4.2 The Dispersion CDF with Multiple Busy Links

We now turn to cases SS, FF, SF and FS, where there are two busy links, R1–R2 and R2–D. In each of the previous cases S and F, the incoming dispersion at the busy link was the same for all pairs, so all pairs were in the same region of operation. In the multiple-busy-link cases, the incoming dispersions at R1 are still the same for all pairs, but pairs entering R2 may have different dispersions because of interference at R1. Consequently, not all pairs operate in the same region on R2–D. For multiple link cases we fix the utilization of R1–R2 at 50%, so the 50% load curves in Figures 5(a) and 5(b) show the distribution of dispersions approaching R2 for cases SS and SF, or FS and FF, respectively.

Case SS. When both busy links are slow links (Figure 5(c)), the incoming dispersions at R2 are either in the point-constrained or unconstrained region. This is seen from the 50% curve in Figure 5(a): approximately 60% of dispersions stay unchanged at $\delta_{init}$, and these are point-constrained with respect to R2–D; the rest of the dispersions are too large to be in the range-constrained region, and are all in the unconstrained region with respect to R2–D. The 60% of pairs that are point-constrained produce jumps at integer values in Figure 5(c), while the rest of the pairs produce both integer and non-integer values. Although the regions of operation don't change at 90% utilization of R2–D, the higher load in the last link dominates the CDF, forcing the dispersion CDF into a nearly stairstep appearance.

Case SF. Now consider case SF, where the second busy link is a fast link (Figure 5(d)). The incoming dispersions at R2 are still distributed as the output of case S with 50% load. However, because R2–D is now faster than the probe access link, router R2 operates in the unconstrained region for all pairs. We expect a stairstep distribution to be induced at router R1, and at 30% utilization these steps are each smoothed as in case F. At 90% utilization, the cross traffic at R2 dominates the dispersion CDF, giving it an overall smooth shape, but actually with many small bumps at the 1 Gb/s back-to-back spacing.

Case FF. For case FF (Figure 5(e)), the incoming dispersions at R2 are distributed as in the 50% curve in Figure 5(b). The vast majority of these dispersions, those greater than $0.2\,\delta_{init}$, are in the unconstrained region when they arrive at R2. Thus, the dispersion CDF for case FF is like case F itself, although with slightly longer tails and smoother edges.

Case FS. Finally, in case FS (Figure 5(f)), pairs arriving at R2 again have dispersions distributed as the output of case F with 50% load, but in this case dispersions up to $\delta_{init}$ are point-constrained, and dispersions between $\delta_{init}$ and $2\delta_{init}$ are range-constrained. A very small number of pairs with dispersion more than $2\delta_{init}$ are unconstrained. Since most of the dispersions are point-constrained, Figure 5(f) shows steps at integer values in the 30%-load curve. The effect of range-constrained dispersions is also evident between normalized dispersions 1 and 2, though the higher-valued band repetitions are too weak to be visible. It is interesting that cases FS and SF are similar in topology, but there are differences in their dispersion CDFs. From this observation we conclude that the order of busy links affects the dispersion CDF.

These results demonstrate how the dispersion CDF is shaped by link capacities, utilizations and order of busy links when there is traffic on multiple links.

3.4.3 The Dispersion CDF with Multiple Packet Sizes

Cross traffic in all the above simulations consists of a single packet size. To consider multiple packet sizes, we must revisit the classification into regions of operation, as well as the simulations.

In general, the three regions still exist for multiple cross traffic packet sizes, using the smallest packet size for $L_X$ in the expressions in Section 3.3. In particular cases, however, it is possible for the range-constrained region to merge with the unconstrained region. This happens when bands due to different packet sizes overlap to cover an unbounded range of $\delta_{final}$ values.

The typical Internet packet size distribution is believed to consist mainly of a few strong modes, around 40, 576 and 1500 bytes [23,29,32]. We recently reported [28] that this distribution appears to have shifted to a mostly bimodal distribution with modes around 40 and 1500 bytes. More details are in Appendix A.

Based on this distribution, we repeated the simulations in the six cases of Figure 4 with a bimodal distribution of packet sizes in cross traffic on the last link in each case. The mix was 30% 1500 byte packets and 70% 40 byte packets. As expected, we observed that the effect of introducing the smaller packets was to remove sharp step transitions in the CDFs, where they existed. For example, Figure 6 shows simulations for case S (with 50% utilization on R1–D) with both unimodal and bimodal cross traffic. All pairs in both curves were in the point-constrained region, but a greater number of possible $\delta_{final}$ points exists in the bimodal case. However, because a 40 byte packet is much smaller than a 1500 byte packet, the CDF is shaped largely by 1500 byte packets.

[Figure 6 plot: CDF versus normalized dispersion (0 to 5); curves: Bimodal, Unimodal.]
Figure 6: Dispersion CDF from simulation case S with bimodal packet size distribution.

This evidence on packet size distribution and the effect of smaller packets indicates that dispersion CDFs on the Internet will be shaped largely by packets near 1500 bytes long.

4. MEASUREMENT OF DISPERSION IN THE INTERNET

In this section we study dispersion measurements taken from paths on the Internet. Our measurements span a period between October 2004 and October 2005 and cover 15 Internet paths between the 8 sites shown in Table 1. We do not have measurements for all 56 paths between the 8 sites due to unstable hosts. Our hosts were located on academic sites, so the paths we measured are over Internet2 and other research networks with a capacity of at least 100 Mb/s. We complement our measurements with a commercial site in Section 5.1 to provide short-term validation of our primary survey; these measurements generally confirm our long-term results.

Table 1: Measurement sites

Site                              Domain       Abbr.
U. of Southern California         usc.edu      SC
UC Santa Barbara                  ucsb.edu     SB
UC San Diego                      ucsd.edu     SD
U. of Mass.                       umass.edu    UM
U. of Maryland                    umd.edu      MD
Inha U., Korea                    inha.ac.kr   KR
Nat. Tech. U. of Athens, Greece   ntua.gr      GR
T.U. Braunschweig, Germany        tu-bs.de     DE

4.1 Experiment Methodology

[...] host as a receiver because interrupt coalescing tainted our measurements.

We did attempt to use PlanetLab [4] for our experiments, but we discovered that high machine load interfered with our measurements.

Overall we made about 56 measurements over twelve months. Since most measurements consist of 24 one-hour segments, we have close to 1300 hours of measurements. Figure 7 shows a small sample of these measurements, one graph for each path we measured. In each graph, we show several dispersion CDFs taken one or more months apart. Within each path, all CDFs are from the same time of day. We do not have measurements for all paths for all months due to unstable hosts.

4.2 Understanding Internet Dispersion CDFs

The data in Figure 7 is taken from the "wild" Internet, so we do not have ground truth about the link and traffic characteristics. We therefore turn to the understanding of small scale network effects on dispersion CDFs developed in Section 3 to interpret our wide-area data.

The captions of Figure 7 identify the topology class (S, F, SF, FS) from Section 3 that we believe most closely matches the dispersion CDF. We omit classes SS and FF since they are subsumed by classes S and F, respectively.

To map between our experimental data and the basic analysis/simulation in Section 3 we must make some assumptions about link speeds and packet size distributions in the Internet. The dispersion CDFs show that all links on our paths have at least 100 Mb/s capacity. Prior studies of distributions of Internet packet sizes have consistently shown strong modes at 1500 and 40 bytes, with occasional weaker modes at 576 or 1200 bytes [23,28,29,32]. In Section 3.4.3 we demonstrated through simulation that bimodal traffic results in slightly rounder steps than unimodal traffic. We consider that effect when relating topology classes to wide-area data. We also label each CDF based on how well it matches our expectations: "expected modes", "unexpected modes", or "no modes". We describe these next.

Expected modes. The Jan 2005 CDF from path DE-SB and all CDFs from paths SB-KR and SC-KR directly match our expectations of a link with cross-traffic (classes S and FS) and point-constrained operation. The strongest modes are at close-to-integer multiples of initial dispersion, consistent with mostly 1500 byte cross traffic packets on a 100 Mb/s link. Closer inspection shows smaller steps, consistent with bimodal cross traffic that includes 40 byte packets.

Unexpected modes. Many of the CDFs show strong modes, but not at values consistent with 100 Mb/s links. We characterize these CDFs (MD-SC, UM-SC, KR-SC, SB-UM, SC-UM, DE-UM, GR-SC and DE-SC) as having unexpected modes. We can break these into three groups with similar features.
Thefirstgroup(MD-SC,UM-SC,KR-SC,andDE- BasicexperimentparametersaredescribedinSection3.1. SC)allterminateat USC,andshowstrongmodesat about A measurement lasts for 24 hours in segments of one hour 1.6 to 1.65× the initial dispersion (final δ of 190–200 µs). each. Before and after each segment, we record traceroute ThesecondgroupallterminateatUMass(SB-UM,SC-UM, results for each path. All but one of the machines used DE-UM),andallshowmodesat1.3to1.4×initialdispersion wereon100Mb/snetworks. Pairsfromthesemachineswere (final spacing of 155–170 µs). Figure 2(a) shows the raw transmitted back-to-back, with initial dispersion being 120 distributionofaportionofaDE-UMmeasurement,showing µs. For the path (MD-SC) the sender was on a 1 Gb/s the very consistent output dispersion at certain multiples. LANduringthemeasurement,butwemaintainedthesame FinallythethirdgroupcontainsthesingletonGR–SC,with initial dispersion of 120 µs. We did not use the 1 Gb/s modes at about 1.2 and 2.3 times initial dispersion. 7 1 1 1 0.8 0.8 0.8 DF 0.6 DF 0.6 Nov 2004 DF 0.6 C 0.4 C 0.4 Jan 2005 (1) C 0.4 Jan 2005 (2) 0.2 Nov 2004 0.2 Mar 2005 0.2 Aug 2005 Jun 2005 Jun 2005 Sep 2005 0 0 0 0 1 2 3 4 5 0 1 2 3 4 5 0 1 2 3 4 5 Normalized dispersion Normalized dispersion Normalized dispersion (a) UCSB to Korea (SB–KR): S, ex- (b) USC to Korea (SC–KR): S, ex- (c) UMd to USC (MD–SC): S, unexpected modes. pected modes. pected modes. 1 1 1 0.8 0.8 0.8 F 0.6 F 0.6 F 0.6 D D D C 0.4 Mar 2005 C 0.4 C 0.4 Mar 2005 Jun 2005 Jun 2005 0.2 Aug 2005 0.2 Mar 2005 0.2 Aug 2005 Oct 2005 Jun 2005 Sep 2005 0 0 0 0 1 2 3 4 5 0 1 2 3 4 5 0 1 2 3 4 5 Normalized dispersion Normalized dispersion Normalized dispersion (d) UMass to USC (UM–SC): S, un- (e) KoreatoUSC(KR–SC):S,unex- (f)GermanytoUSC(DE–SC):S,unexpected modes. pected modes. expected modes. 
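To make the relation between a mode's position and a link rate concrete, the following minimal sketch converts a normalized dispersion into the rate at which one probe packet would be serialized in that time. It assumes 1500-byte probe packets, an assumption inferred from the 120 µs back-to-back spacing on 100 Mb/s networks rather than stated in this section; the function and constant names are hypothetical and are not part of the measurement tools used for the experiments.

```python
# Sketch: map a normalized dispersion mode to an implied link rate.
# Assumption: probe packets are 1500 bytes (inferred from 120 µs
# back-to-back spacing at 100 Mb/s); names here are illustrative only.

PROBE_BYTES = 1500      # assumed probe packet size in bytes
DELTA_INIT_US = 120.0   # initial dispersion in microseconds


def serialization_time_us(size_bytes: float, rate_mbps: float) -> float:
    """Time to transmit size_bytes at rate_mbps, in microseconds.

    bits / (Mb/s) yields microseconds directly.
    """
    return size_bytes * 8 / rate_mbps


def implied_rate_mbps(normalized_dispersion: float,
                      delta_init_us: float = DELTA_INIT_US,
                      probe_bytes: int = PROBE_BYTES) -> float:
    """Rate at which one probe packet is serialized in the final dispersion."""
    delta_final_us = normalized_dispersion * delta_init_us
    return probe_bytes * 8 / delta_final_us


if __name__ == "__main__":
    # Mode positions quoted in the text for the USC and UMass groups.
    for label, mode in [("USC group", 1.6), ("USC group", 1.65),
                        ("UMass group", 1.3), ("UMass group", 1.4)]:
        print(f"{label}: {mode:.2f} x delta_init -> "
              f"{implied_rate_mbps(mode):.1f} Mb/s")
```

Under these assumptions, modes at 1.6 and 1.65 times the initial dispersion correspond to roughly 62.5 and 60.6 Mb/s, and modes at 1.3 to 1.4 times correspond to roughly 77 to 71 Mb/s. Neither range matches a standard 100 Mb/s link, which is why these modes are labeled unexpected.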
Figure 7: Dispersion CDFs from Internet experiments. (Each panel plots CDF versus normalized dispersion, with one curve per measurement month.)
(a) UCSB to Korea (SB–KR): S, expected modes.
(b) USC to Korea (SC–KR): S, expected modes.
(c) UMd to USC (MD–SC): S, unexpected modes.
(d) UMass to USC (UM–SC): S, unexpected modes.
(e) Korea to USC (KR–SC): S, unexpected modes.
(f) Germany to USC (DE–SC): S, unexpected modes.
(g) UCSB to UMass (SB–UM): S, unexpected modes.
(h) USC to UMass (SC–UM): S, unexpected modes.
(i) Germany to UMass (DE–UM): S, unexpected modes.
(j) Greece to USC (GR–SC): S, unexpected modes.
(k) Germany to UCSB (DE–SB): FS/SF, no modes.
(l) UCSD to Germany (SD–DE): F/SF, no modes.
(m) USC to Germany (SC–DE): SF, no modes.
(n) USC to UCSB (SC–SB): SF, no modes.
(o) UMass to UMd (UM–MD): SF, no modes.

Since we do not have ground truth for these paths we cannot fully explain them. However, our simulations provide some basis for a hypothesis. Taking the UMass traces as an example (specifically the SC–UM trace), we first observe that about 60% of pairs show dispersions around 170 µs or slightly less. This spacing is consistent with a half allocation of an OC-3 link (155 Mb/s halves to 77 Mb/s, consistent with 17 µs spacing); we believe this link represents the path bottleneck. The third strongest mode (about 10% of packets) is at twice this value (2.4δinit or 34 µs), consistent with the pairs trapping a 1500-byte packet on this bottleneck.

Several remaining modes for this path occur at 1.1, 1.7, 2.05 and 2.75 times δinit. In fact, the mode at 1.1δinit is the second strongest mode with about 18% of pairs. These weaker modes are each about ±0.3δinit off of the strongest mode and its double. These modes are consistent with traffic passing through a 300 Mb/s link.

The group of paths terminating at USC shows a primary mode around 1.65δinit (198 µs), with about 80% of the pairs making a smooth S shape between 1.6 and 1.7δinit. The main secondary mode is at about 1.02δinit. The primary mode is consistent with traffic passing through a bottleneck link around 62 Mb/s. We believe the secondary mode is caused by compression: the pairs gain a large amount of slack from the bottleneck link, then a few queue behind a router at a 100 Mb/s link.

While these interpretations are consistent with our analysis, the non-standard bottleneck speeds imply that we do not yet have a complete understanding of this network. We are working on obtaining ground truth on these paths.

No modes. We consider Jun–Sep 2005 CDFs from path DE–SB, and all CDFs from paths SD–DE, SC–DE, SC–SB and UM–MD to have “no modes”, with relatively smooth CDFs and no clear modes (except possibly at 1, which represents unchanged final dispersion). We believe that the continuous shape of these CDFs is caused by cross traffic at a fast link, consistent with case SF in Section 3.4.2. Our hypothesis is that a busy high-speed link (much faster than 100 Mb/s) causes operation in the unconstrained region in this class, and that any 100 Mb/s links causing significant dispersion are upstream of this fast link.

We now move on to overall observations drawn from the ensemble of dispersion CDFs.

4.3 Observation 1: Persistence

The first observation from Figure 7 is that the dispersion CDFs remain fairly persistent over periods of months, as can be seen by very similar modes and shapes across CDFs that are months apart. For example, path SC–DE (Figure 7(m)) had a consistent shape (80% of packets have their initial dispersion and the rest have a smooth range of other values) for seven months.

Explanation: Persistence of dispersion CDFs indicates relatively stable underlying traffic conditions. As we have seen in Section 3, background traffic intensity mainly affects the magnitude of the CDF (the y-axis) and packet size distribution affects the location of the modes. The stability in our results indicates that neither of those changes drastically on timescales of months.

Caveats: For a few paths (DE–SB, SD–DE, and SC–DE) we see a diurnal cycle as shown in Figure 8. This is a well-known phenomenon and our fingerprints are able to catch it. We expand on these diurnal changes in Section 4.5 and quantify their differences in Section 6.2.

Figure 8: Dispersion CDFs every two hours for path DE–SB on Friday, June 24, 2005. (Plot of CDF versus normalized dispersion.)

4.4 Observation 2: Distinctness

The dispersion CDFs in Figure 7 show another interesting property: often signatures for different paths are quite distinct. For example, DE–SC, MD–SC and GR–SC, all of which terminate at the same destination, are quite different, as are DE–SC and SC–DE. The difference between DE–SC and SC–DE is not surprising as they do not have any traceroute hops in common.

Several groups, however, show strong similarities. For example, MD–SC, UM–SC, KR–SC, DE–SC form one class (terminating at USC); SB–UM, SC–UM, DE–UM form a second class (terminating at UMass); and DE–SB (omitting Jan. 2005), SD–DE (omitting Jan. 2005), SC–DE form a third class. We believe these similarities indicate that the paths share common busy links. We can confirm this with traceroute. We see four common hops at the destination for the first group and four for the second. The third group has paths in both directions involving Germany so we see common hops (at least 9) only on the two forward paths. We speculate that common hops exist on the reverse path also, and may be detectable by alias resolution.

Explanation: Dispersion CDFs are influenced by link speeds and cross traffic. Paths with different link speeds and traffic generate different dispersion CDFs, while paths that share many links (particularly bottleneck or busy links) generate similar CDFs.

Caveat: In some cases we have observed fingerprint changes that we associate with traffic changes (rather than routing changes). We examine these next.

4.5 Characterizing Fingerprint Changes

Our statements about persistence and distinctness are based on the claim that dispersion CDFs reflect underlying link and traffic characteristics. Thus, it is not surprising that significant routing or traffic changes can result in corresponding changes in the dispersion CDFs. We have seen three kinds of changes: on three paths we observe diurnal changes of fingerprints, in three cases we observed significant changes between measurements, and in two cases we captured significant changes during our 24 hour observations. We describe each of these below, focusing on the two cases that we captured in action.

Although most paths are very stable regardless of observation time, we observe diurnal changes on paths DE–SB, SD–DE and SC–DE. Figure 8 shows 12 measurements for DE–SB. This change in CDF is consistent with a change in cross traffic volume, as in our simulation cases F and SF. In Section 6.2 we quantify these changes. The presence of diurnal effects suggests that, if dispersion CDFs are used to identify paths, then they must either be taken over the entire day (to smooth out traffic variation), or taken at a consistent time of day. The plots in Figure 7 confirm long-duration persistence once diurnal effects are factored out by measuring at a consistent time of day.

We saw three cases where there were significant changes in the month between our observations. The three paths are DE–SB, SD–DE and DE–UM (Figures 7(k), 7(l) and 7(i)), all involving a source or destination in Germany, with the change occurring between January 2005 and later traces. (Our one other path involving Germany (DE–SC) does not show this change because we lack data for January 2005.) Examination of traceroutes indicates that this change in CDF corresponds to a change in path: an extra hop appears in all of the January paths. In January, each path showed some level of traffic, while later traces show more traffic but a smooth CDF. This change is consistent with an increase in bandwidth of some intermediate link with some utilization (for example, perhaps a change in the LAN used at a peering point).

Of the two major changes during an observation, the first was on path SC–KR, during a 24-hour observation in January 2005. Dispersion CDFs before the change, from November 2004 and January 2005, are similar, and have a large mode at 2δinit. CDFs after the change are also consistent with each other, with 99% of dispersions nearly unchanged (between δinit and 1.06δinit). During January 2005 we observed a 2.5 hour outage, with a new signature after the outage. The route reported by traceroute did not change (but both routes show four anonymous hops). The outage may indicate a hardware change and the change in CDF indicates that fewer pairs experienced large increases in dispersion after the change. This change in CDF suggests the presence of traffic before and the reduction of traffic afterwards, or perhaps an increase in link capacity.

Our last example of a major signature change during an observation was on path DE–UM in June 2005. We observe a stable path (call it R1), followed by an alternate route (R2) on Thu June 23 2005 between hours 12 and 13 of the 24-hour observation period, reverting to the original route after hour 13. The R1–R2 change keeps the same dispersion CDF, but the return of route R1 (after hour 13) is concurrent with a very different dispersion CDF than before hour 12 (see Figure 9 around time 2500 s). This condition persists through the rest of June, but is gone in our August and later measurements. This case is very puzzling, since the route change does not correspond to the CDF change. Considering Figure 9, we see that the modes before and after are similar, but the strengths of each band vary greatly. Perhaps this indicates that the outage corresponds to a change in the peering point that results in changed cross traffic.

Figure 9: One-hour dispersion time series on path DE–UM.

Finally, we also observed the converse of these examples: long-term stable paths and stable dispersion CDFs. For example, for path SC–UM, the dispersion CDFs are quite stable. The router-level path (from traceroute) shows some variation over that time (16 to 18 hops), but its structure remains unchanged with five ISPs (USC, CENIC, Internet2, Northern Crossroads, and UMass) for the twelve months of observation. We observed a similar level of stability for other paths not mentioned above.

These examples indicate that significant changes in the network are often associated with changes in the dispersion CDF, but that long-term measurements indicate generally stable and distinct dispersion CDFs. Based on these observations, we propose to use the dispersion CDF as a path signature. In the next sections we briefly review other factors that might affect dispersion CDFs (Section 5) and then turn to applications in Section 6.

5. SENSITIVITY ANALYSIS

Our approach to collecting dispersion CDFs requires a number of measurements. We next evaluate the sensitivity of these measurements to measurement parameters, such as probe rate and measurement duration, and to environmental factors such as system load and clock drift. We first validate that our experimental results are not biased by high-speed academic networks, then we look at varying measurement parameters on a specific path. (We use the SC–UM path for these measurements, chosen because we have the longest observations for this path.)

5.1 Paths in the Commercial Internet

The dispersion measurements presented in Section 4 begin and end at hosts in academic networks, and so the paths often exploit research networks such as Internet2. Internet2 is designed to be very high speed, and perhaps business concerns force commercial links to be run at higher utilizations. To investigate whether our observations hold on commercial networks as well as academic networks we took short-term measurements from a host placed in the commercial address space of Los Nettos, a regional network, labeled here “LN”.

We measured paths UM–LN, MD–LN, SB–LN, DE–LN and SD–LN, and the reverse path LN–UM. We verified that each of these paths goes over non-Internet2, commercial links. We do not reproduce all this data here, but in Figure 10 we show dispersion CDFs for paths DE–LN and LN–UM.

During a week of observation, we found that the dispersion CDFs on these paths behaved similarly to those on the academic paths: CDFs are persistent and in general different on different paths. These experiments serve as preliminary verification of dispersion CDF properties on commercial paths.

5.2 Effect of Packet Pair Rate

Throughout this paper we use the relatively high probe rate of 100 pairs/s (with exponential interarrivals), a data rate of 2.4 Mb/s. We chose this rate to get good accuracy without great overhead relative to link capacities of 100 Mb/s or more. However, a lower probe rate would be attractive to reduce the overhead. Here we test the effect of
