ebook img

Missing measurements on RIPE Atlas PDF

0.99 MB·
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Missing measurements on RIPE Atlas

Missing measurements on RIPE Atlas A view from probe connection activity Wenqin Shao*, Jean-Louis Rougier*, François Devienne† and Mateusz Viste† *TelecomParisTech,† Border6 ABSTRACT 1.00 7 Weshowthatitiscommontolosesomedatapointsformea- 0.75 1 surementsscheduledatregularintervalonRIPEAtlas. The DF0.50 0 temporalcorrelationbetweenmissingmeasurementsandcon- C 0.25 2 nectioneventsareanalyzed,inthepursuitofunderstanding 0.00 n reasonsbehindsuchmissings. Tooursurprise,abigpartof 0 2 10 100 1000 10800 a measurementsarelostwhileprobesareconnected. # of missing measurement J Figure 1: CDF of total missing length per probe. 4 1. INTRODUCTION WeobservedtheRIPEAtlasplatformforonemonth, ] RIPE Atlas, a public measurement platform, enables from 2016-06-01 to 2016-07-01 UTC. All the v3 probes I N various application such as performance monitoring [8, firstconnectedbeforethebeginningdate(11613ofthem) 18], anomalies detection [9, 17, 14], peering and IXP are considered. Connection events (id 7000) and built- . s measurements [4, 16] etc. As it continues to gain pop- inPingmeasurementstoDNSb-root(id1010),ahighly c [ ularity among network operators and researchers, its available destination, are collected [1]. Controllers and measurement quality becomes a natural concern. It is the ping destination are not within the same network. 1 nowknownthatloadhaveobviousimpactsonmeasure- Controller logs the moments at which probes connects v 8 ment precision and scheduling [10, 6]. to and disconnects from it. The built-in ping mea- 3 We focus on data completeness, another aspect of surement is scheduled on every probe at 4min interval. 9 measurementqualitythatreceivedlessattentionsofar. 10800 ping results are thus expected from each probe 0 Missingmeasurementscancausevariousundesiredcon- within the month. 7353 probes, out of the available 0 sequences. Apart from widening confidence interval of 11613, had Ping measurements during this period. . 1 inference[9],itrequiresingeneralmethodologicaladap- 0 3. MISSINGATFIRSTGLANCE tations, e.g. in spectrum analysis [5, 15, 19], otherwise 7 1 biased estimation would be expected [7]. 4440probes(60.4%)missnomorethan2datapoints, : One obvious reason of missing is that the probe is which is totally legitimate, as random jitter is added v not running (properly), e.g. power off [2]. As long as to each single measurement to avoid synchronization i X a probe is powered, it tries to maintain a connection to among probes. For the rest, the missing length spans r acontrollertoreportmeasurementsandreceiveassign- a wide range according to Fig. 1. 1358 (18.5%) probes a ments. Therefore the probe connection activity pro- miss more than 10% of the total measurements (i.e. 72 vides a good indication of the probe availability, and hours over a month). is used in current investigation conducted by RIPE on probe OS stability [13, 11, 20]. 4. CROSSWITHCONNECTIONEVENTS In order to infer other possible causes, we crossed Several reasons may contribute to the disconnection the measurement timestamps with the moments probe ofaprobetoitscontroller: 1)probenotworking(prop- connects to and disconnects from a Atlas controller. erly); 2) network issues preventing the connection; 3) If measurement missing coincides with the probe dis- controller not available, e.g. during maintenance [3]. connection, chances are that the probe is dysfunctional Meanwhile, the last two reasons shall not prevent a during the missing. However, if measurements are lost probe from performing built-in measurements, as the whiletheprobeiswellconnected,something‘abnormal’ results can be unreachable or timeout, and be stored should be expected, beyond the known probe OS issue. locally on the probe [12]. That is to say, missing does not necessarily occur when probe is disconnected, but is 2. DATACOLLECTION unexpected while probe is connected. 1 4.1 Overlapwithconnectedperiod l overlapped l shifted 10800 s nt (D/C, C) 5793 the rest 8314 me 1000 e ur 100 1d s # mea 102 T ... 4m1ihn H shifted overlapped G 0sec Figure 2: Missing length distribution. RI ... −4min We count, for each missing segment, the number of −1h −1d missingmeasurementsthatoccurredduringaconnected cpoeurinotd.byWtheeolbetnaginththofemovisesrilnagp sreagtimoenbtys.diTvihdeindgistthriis- −1d−1h−4min 0sec 4min1h 1d −1d−1h−4min 0sec 4min1h 1d bution of overlap ratio is concentrated at the two ends, Time from LEFT edge to the nearest connection event 0 and 1. For the convenience of illustration, we cut Figure 3: (D/C, C) stands for missing segments more missing segments into two groups, one with overlap ra- closelycorrelatedwithdisconnectedperiod. Numberof tio ≤ 0.5, denoted as shifted, the other with the rest, concerned missing segments is given in the title. Neg- denotedasoverlapped. Measurementmissingthatover- ative time distance means the edge happens before the laps connected period is ‘unexpected’. connection event and vice verse. The two groups demonstrate different length distri- Ping interval (4min), and recovery tends to take place bution profiles, Fig. 2. 15391 missing segments are 4 minute after the connection. Such strong correlation observed. 10292 (66.87%) missing segments are over- with probe disconnected period indicates that probe lapped with connected period. They are mostly short dysfunction is probably the cause. in length. 5560 of them last no more than 2 measure- However,thebeginningofsuchmissingsegments(D/C, ments. Onepossibleexplanationisthatthesemeasure- C) can as well be dislocated from disconnection event. ments are skipped due to scheduling or load issues [2, At the left end of the graph, measurements are lost 10]. Meanwhile, 2490 of them are equal to or longer longbeforethedisconnection,whichwefind‘abnormal’, than 1 hour, involving only 620 probes, for which we even though the recovery is near connection. believe that the previous explanation hardly applies. Totherightend, measurementsonlybegintobelost Missing segments shifted from connected period are a long time after the probe is disconnected. One pos- more likely to be long. This is possibly due to the v3 sible explanation is that measurements are first stored probe OS stability issue still under investigation. It is locally after disconnection from controller [12]. Then known to be responsible for long term probe discon- new measurements are lost after local storage is full. nection and requires manual operation to recover the Contrary to the compact distribution in (D/C, C) probe [12, 13, 11, 20]. sub-graph, the majority of the rest missing segments spreadsalongthetimeaxis. Thedistancesfromleftand 4.2 Temporalcorrelation right edge to connection events are highly correlated, suggesting that both left and right edges are on the Toobtainasubtlerview,weseektofindout: whendo same side of a same connection event. We note that measurementsbegintobelostandwhenaretheyrecov- thesemeasurementsweremostlylostwhileprobeswere ered? Are these moments close to connection events? well connected, i.e. overlapped. First, we define the left edge as the last measure- ment before a period of measurement absence, and the 5. CONCLUSION right edge, accordingly, the first measurement after re- covery. We then calculate the time interval separating Inouranalysiscoveringalargenumberofprobesover these edges and the closest connection events. We also one month, only 60% of v3 Atlas probes have complete identify the nature of these events, and mark ‘D/C’ for measurements. Around1/3missingsegmentsappearto disconnection, ‘C’ for connection. closelycorrelatedtodisconnectedperiod. TheprobeOS 1284 missing segments locate at the beginning or the stability issue might have contributed to such missings, end ofthe observation period. Theyare unavailablefor as suggested by the heavy tail of the missing lengths. this analysis as only one edge can be observed. For the However,theremaining2/3ofmissingsoccurredwhile rest with both edges, 5793 missing segments’ left edge probes are connected. Half of them are no more than is closer to a disconnection event and the right edge is 2 measurements in length, and are thus likely to be closer to a connection event, according to Fig. 3 sub- caused by scheduling issues. However, around 25% of graph(D/C,C).Judgingfromthedensitycontour,last this category lasts long(≥ 1h). The reason of such an measurement most likely precedes disconnection by a unexpected behavior is still to be determined. 2 6. REFERENCES Detecting asymmetric routing over ixps. https://github.com/inex/ixp-as. [1] Built-in measurements. [17] R. Padmanabhan, P. Owen, A. Schulman, and https://atlas.ripe.net/docs/built-in/. N. Spring. Timeouts : Beware Surprisingly High [2] Ripe forum [atlas] actual measurement interval Delay Categories and Subject Descriptors. In much larger than planned. https://www.ripe. IMC, pages 303–316, 2015. net/participate/mail/forum/ripe-atlas. [18] M. Rimondini, C. Squarcella, and G. D. Battista. [3] Ripe forum [atlas] new on ripe labs: Another look Towards an Automated Investigation of the at ripe atlas probe lifetimes. https://www.ripe. Impact of BGP Routing Changes on Network net/participate/mail/forum/ripe-atlas. Delay Variations. PAM, (317647):193–203, 2014. [4] E. Aben. Measuring countries and ixps with ripe [19] W. Shao, J.-L. Rougier, F. Devienne, and atlas. https://labs.ripe.net/Members/ M.Viste.ImproveRound-TripTimeMeasurement emileaben/measuring-ixps-with-ripe-atlas. Quality via Clustering in Inter-Domain Traffic [5] P. Babu and P. Stoica. Spectral analysis of Engineering. (AnNet):1105–1108, 2016. nonuniformly sampled data – a review. Digital [20] R. Wilhelm. Another look at ripe atlas probe Signal Processing, 20(2):359–378, 2010. lifetime. [6] V. Bajpai and S. J. Eravuchira. Lessons Learned https://labs.ripe.net/Members/wilhelm/ from using the RIPE Atlas Platform for another-look-at-ripe-atlas-probe-lifetimes. Measurement Research. ACM SIGCOMM Computer Communication Review, 45(3):35–42, 2015. [7] A. N. Baraldi and C. K. Enders. An introduction to modern missing data analyses. Journal of School Psychology, 48(1):5–37, 2010. [8] M. Candela. Latencymon. https://labs.ripe.net/Members/massimo_ candela/new-ripe-atlas-tool-latencymon. [9] R. Fontugne, E. Aben, C. Pelsser, and R. Bush. Pinpointing Delay and Forwarding Anomalies Using Large-Scale Traceroute Measurements. 2015, 2016. [10] T. Holterbach, C. Pelsser, R. Bush, and L. Vanbever. Quantifying Interference between Measurements on the RIPE Atlas Platform. IMC, pages 437–443, 2015. [11] P. Homburg. Further analysis of ripe atlas version 3 probe. https: //labs.ripe.net/Members/philip_homburg/ further-analysis-of-ripe-atlas-version-3-probe. [12] P. Homburg. Troubleshooting ripe atlas probe usb sticks. https: //labs.ripe.net/Members/philip_homburg/ troubleshooting-ripe-atlas-probes-usb-sticks. [13] P. Homburg. A visual impression of probe lifetime. https: //labs.ripe.net/Members/philip_homburg/ a-visual-impression-of-probe-lifetimes. [14] S. Kerr, D. Quinn, and D. Milosevic. Halo: Network outages dashboard. https://github. com/RIPE-Atlas-Community/ripe-atlas-halo. [15] M. Luckie, A. Dhamdhere, D. Clark, B. Huffaker, and K. Claffy. Challenges in Inferring Internet Interdomain Congestion. In IMC, pages 15–22, New York, New York, USA, nov 2014. ACM Press. [16] B. O’Donovan, D. Taylor, and J. Drabczyk. 3

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.