Improving End-to-End Availability Using Overlay Networks
by
David Godbe Andersen
S.M., Computer Science, Massachusetts Institute of Technology (2001)
B.S., Computer Science, University of Utah (1998)
B.S., Biology, University of Utah (1998)
Submitted to the Department of Electrical Engineering and Computer Science
in partial fulfillment of the requirements for the degree of
Doctor of Philosophy in Computer Science and Engineering
at the
MASSACHUSETTS INSTITUTE OF TECHNOLOGY
February 2005
© Massachusetts Institute of Technology 2004. All rights reserved.
Author...........................................................................
Department of Electrical Engineering and Computer Science
December 22, 2004
Certified by.......................................................................
Hari Balakrishnan
Associate Professor of Computer Science and Engineering
Thesis Supervisor
Accepted by......................................................................
Arthur C. Smith
Chairman, Department Committee on Graduate Students
Improving End-to-End Availability Using Overlay Networks
by
David Godbe Andersen
Submitted to the Department of Electrical Engineering and Computer Science
on December 22, 2004, in partial fulfillment of the
requirements for the degree of
Doctor of Philosophy in Computer Science and Engineering
Abstract
The end-to-end availability of Internet services is between two and three orders of magnitude worse
than other important engineered systems, including the US airline system, the 911 emergency
response system, and the US public telephone system. This dissertation explores three systems
designed to mask Internet failures, and, through a study of three years of data collected on a 31-site
testbed, why these failures happen and how effectively they can be masked.
A core aspect of many of the failures that interrupt end-to-end communication is that they fall
outside the expected domain of well-behaved network failures. Many traditional techniques cope
with link and router failures; as a result, the remaining failures are those caused by software and
hardware bugs, misconfiguration, malice, or the inability of current routing systems to cope with
persistent congestion. The effects of these failures are exacerbated because Internet services depend
upon the proper functioning of many components—wide-area routing, access links, the domain
name system, and the servers themselves—and a failure in any of them can prove disastrous to the
proper functioning of the service.
This dissertation describes three complementary systems to increase Internet availability in the
face of such failures. Each system builds upon the idea of an overlay network, a network created
dynamically between a group of cooperating Internet hosts. The first two systems, Resilient Overlay
Networks (RON) and Multi-homed Overlay Networks (MONET), determine whether the Internet
path between two hosts is working on an end-to-end basis. Both systems exploit the considerable
redundancy available in the underlying Internet to find failure-disjoint paths between nodes, and
forward traffic along a working path. RON is able to avoid 50% of the Internet outages that interrupt
communication between a small group of communicating nodes. MONET is more aggressive,
combining an overlay network of Web proxies with explicitly engineered redundant links to the
Internet to also mask client access link failures. Eighteen months of measurements from a six-site
deployment of MONET show that it increases a client’s ability to access working Web sites by
nearly an order of magnitude.
Where RON and MONET combat accidental failures, the Mayday system guards against
denial-of-service attacks by surrounding a vulnerable Internet server with a ring of filtering routers.
Mayday then uses a set of overlay nodes to act as mediators between the service and its clients,
permitting only properly authenticated traffic to reach the server.
Thesis Supervisor: Hari Balakrishnan
Title: Associate Professor of Computer Science and Engineering
To my parents, Mary Lou Godbe and Jerry Richard Andersen,
and to my grandfather Hampton Clawson Godbe,
who always encouraged me to discover things,
instilling in me the curiosity to become a scientist,
and the impatience to become a computer scientist.
Acknowledgments
I am deeply indebted to my advisor, Hari Balakrishnan, for making five years of graduate school
one of the best periods of my life. For each effort I put into my research and this dissertation, I
think Hari put two. I arrived at MIT quite without a clue, and rather worried that the admissions
committee would soon realize their drastic error in inviting me. Fortunately, it was difficult to stray
too far afield when striving to follow Hari’s example and exemplary advice. He was a better advisor
than I imagined possible.
Frans Kaashoek let me steal an enormous amount of his time and wisdom as I walked a line
between networks and systems. Frans’s vigorous approach both to research and to life helped show
me the fun that can be had in academia. I benefited enormously from the time I spent visiting PDOS
group meetings and getting an infusion of hard-core systems.
In addition to so generously serving on my committee, and on my typically short notice, John
Guttag taught me many of the right questions to ask about research. I know of nobody better to turn
to when wondering, “What’s the really important thing about what I’m doing here?”
I have to thank Robert Morris for two things: First, as one of the people behind the RON project,
for his early guidance and frequent advice; and second, for sharing his passion for all things
timing-related and letting me frequently barge in and borrow a rubidium oscillator or frequency
counter when I was avoiding my real work.
Alex Snoeren, my officemate for my first three years at MIT, showed me how to swim, both
in research and socially, in a very different pool than the one from which I came. He was a great
mentor, office-mate, and a damn smart blackboard off which to bounce my crazy idea or dumb
question of the week. He’s a great friend to have. Thank you, Alex. On the topic of swimming, I
owe Christine Alvarado for similar help, both as an occasional training partner and teacher, and for
successfully juggling her own Ph.D. research while encouraging those around her to occasionally
flee the confines of their cubicles to meet other people.
Alex, Allen Miu, Stan Rost, and Jon Salz made our office in room 512 an excellent (and loud)
place to work. Stan, thank you for the music, and all the rest. Allen, thank you for making me notice
the badminton courts. I’ll miss our daily interaction.
Nick Feamster tolerated me as both a friend, an officemate and an apartment-mate, and did an
excellent job at all of the above. I may have gotten more research done while running and cycling
with Nick than I did in the office. Where my research touched BGP, Nick was right there. We
built a crazy measurement infrastructure, and hopefully, we’ll keep on building it. On an unrelated
note, I dedicate the eradication of the word “methodology” in this dissertation both to Nick and to
Michael Walfish. If I ever manage to write half as well as Walfish, I’ll count myself happy. My
future sushi parties, however, must be dedicated to Jaeyeon Jung. Mythili Vutukuru and Sachin
Katti also displayed impressive noise-tolerance during our short time of sharing an office.
I spent my time at MIT in the Networks and Mobile Systems group, whose members’ diverse
research interests always let me learn something new, whether about wireless networks, streaming
databases, BGP routing, or whatnot. Brett Hull, Kyle Jamieson, Bodhi Priyantha, Eugene Shih,
and Godfrey Tan all deserve blame for doing cool things and being willing to teach me about them.
Rohit Rao helped write the second and much improved versions of the MONET proxy and parallel
DNS server, thereby greatly improving my chances of graduating. Magda and Michel, I raise a
Costco membership card in your honor. Dina Katabi provided much good feedback about nearly
everything I wondered about.
Eddie Kohler and David Mazières showed me the true PDOS spirit, carried on by John Jannotti,
Bryan Ford, Kevin Fu, Michael Kaminsky, Athicha Muthitacharoen, Emil Sit, Benjie Chen, Jinyang
Li, Max Krohn, and Jacob Strauss. Sameer and Mandi, thank you for many a Christmas party, and
many a good conversation. Rodrigo and Chuck, don’t stop debunking peer-to-peer. Thank you to
Chris Laas for never letting me BS, and to Dan, John, Frank and Stribling for Halo. Doug Decouto,
John Bicket and Sanjit Biswas, Daniel Aguayo, and Ben Chambers’s Roofnet showed me that good
work can (and should) be inspirational and extremely cool. Doug, good wind for your sails. Thomer
Gil and Alex Yip provided inspiration in research and cycling.
Dorothy Curtis and Michel Goraczko put up with my constant requests for more disk space, my
occasionally breaking the network, and the many outrageous requests I made at ridiculous hours. I
placed a similar burden on the LCS infrastructure group; I give particular thanks to Garrett
Wollmann and Mary Ann Ladd, who carved through the administrative bureaucracy to let me install DSL
lines, collect routing data, and access dedicated Internet connections. Jay Lepreau, Mike Hibler, and
Leigh Stoller and the rest of the Emulab crew provided invaluable assistance with managing and
running the RON testbed. Thanks also to the many anonymous users at MIT who used and abused the
proxy to provide us with data to analyze MONET.
MIT has an amazing collection of people from whom to learn, and a group of professors and research
scientists who shared generously with their much-demanded time. David Karger did his best to
teach me some theory, even if that was like trying to teach a pig to fly. David Gifford introduced me
to the joys of machine learning through computational biology. John Wroclawski and Karen Sollins
somehow always made time to talk to me about things network architecture and otherwise.
My research was supported by a Microsoft Research fellowship, DARPA, the National Science
Foundation, and MIT. Many thanks to all for making this research possible.
My mother, Mary Lou Godbe, bravely and patiently proof-read many of the papers I wrote, and
proof-read a draft version of this dissertation, steeping herself deeper in computer science jargon
than I’m sure she ever imagined.
While I’m deliberately leaving out many good friends and family from this list (the list of
academic people is too long already—and I fear it’s drastically incomplete!), I do owe a heart-
felt thank you to Naomi Kenner for supporting me during my job search and related trauma that
temporarily turned me into a stressed-out basket case. Thank you, Ny. You rock. Many thanks to my
friends from the outing, cycling, triathlon, and masters swimming clubs. You kept me comparatively
sane for five years. Rob Jagnow was an excellent friend and adventure partner, who also provided
a sterling example of how many responsibilities one can successfully juggle if one really tries.
Between Rob, Hector Briceno, Luke Sosnowski, and Robert Zeithammer, I was never lacking for
friends or inspiration to do crazy things involving the mountains.
My friends and family, named and unnamed, form the richest tapestry in which anyone could
hope to live. You know who you are, and please know how much I appreciate you all.
Contents
1 Introduction
1.1 End-to-End Service Availability
1.2 Challenges to Availability
1.3 Coping With Failures and Degradation
1.4 Contributions
2 Related Work
2.1 Availability and Performance of Internet Systems
2.2 Internet Infrastructure
2.3 Improving End-to-End Availability
2.4 Denial of Service
2.5 Overlay Networks
3 Method
3.1 Approach
3.2 Measurement and Validation
3.3 Failures and Performance Metrics
3.4 The RON Testbed
3.5 Testbed Overview
3.6 Closely Related Testbeds
3.7 Summary
4 Resilient Overlay Networks
4.1 Design Goals
4.2 Exploratory Study Results
4.3 Software Architecture
4.4 Routing and Path Selection
4.5 Policy Routing
4.6 Data Forwarding
4.7 Bootstrap and Membership Management
4.8 Implementation and the RON IP Tunnel
4.9 Summary
5 RON Evaluation
5.1 Method
5.2 Overcoming Path Outages
5.3 Overcoming Performance Failures
5.4 RON Routing Behavior
5.5 Security Discussion
5.6 Summary
6 Multi-homed Overlay Networks
6.1 Introduction
6.2 System Architecture
6.3 Obtaining Multiple Paths
6.4 Waypoint Selection
6.5 Reducing Server Overhead
6.6 Implementation
6.7 Explicit Multi-homing
6.8 ICP+ with Connection Setup
6.9 Summary
7 MONET Evaluation
7.1 MONET Testbed and Data Collection
7.2 Characterizing Failures
7.3 How Well does MONET Work?
7.4 Server Failures and Multi-homed Servers
7.5 Discussion and Limitations
7.6 Summary
8 Preventing Denial-of-Service Attacks using Overlay Networks
8.1 Introduction
8.2 Design
8.3 Attacks and Defenses
8.4 Analysis
8.5 Practical Deployment Issues
8.6 Summary
9 Conclusion
9.1 Summary
9.2 Open Issues and Future Work