Appearedin7thUSENIXSymposiumonNetwork DesignandImplementation(NSDI’10) Experiences with CoralCDN: A Five-Year Operational View MichaelJ.Freedman PrincetonUniversity Abstract convenience and availability. CoralCDN has since re- mainedpubliclyavailableformorethanfiveyearsathun- CoralCDN is a self-organizing web content distribution dredsofPlanetLabsitesworld-wide.Accountingforama- network(CDN).PublishingthroughCoralCDNisassim- jorityofpublicPlanetLabtrafficandusers,CoralCDNtyp- ple as making a small change to a URL’s hostname; a icallyservesseveralterabytesofdataperday,inresponse decentralizedDNSlayertransparentlydirectsbrowsersto totensofmillionsofHTTPrequestsfromaroundtwomil- nearbyparticipatingcachenodes,whichinturncooperate lionusers(uniqueclientIPaddresses). tominimizeloadontheoriginwebserver. CoralCDNhas Over the course of its deployment, we have come to been publicly available on PlanetLab since March 2004, acknowledge several realities. On a positive note, Coral- accounting for the majority of its bandwidth and serving CDN’snotablysimpleinterfaceledtowidespreadandin- requestsforseveralmillionusers(clientIPs)perday. This novative uses. Sites began using CoralCDN as an elas- paper describes CoralCDN’s usage scenarios and a num- ticinfrastructure,dynamicallyredirectingtraffictoCoral- berofexperiencesdrawnfromitsmulti-yeardeployment. CDNattimesofhighresourcecontentionandpullingback Theselessonsrangefromthespecifictothegeneral,touch- astrafficlevelsabated. Ontheflipside,fundamentalparts ing on the Web (APIs, naming, and security), CDNs (ro- of CoralCDN’s design were ill-suited for its deployment bustnessandresourcemanagement),andvirtualizedhost- andthemajorityofitsuse. Ifoneweretoconsiderthevar- ing(visibilityandcontrol). Weidentifydesignaspectsand iousreasonsforitsuse—forresurrectinglong-unavailable changesthathelpedCoralCDNsucceed,yetalsothosethat sites,supportingrandomsurfing,distributingpopularcon- provedwrongforitscurrentenvironment. 
tent, and mitigating flash crowds—CoralCDN’s design is insufficient for the first, unnecessary for the second, and 1 Introduction overkillforthethird,atleastgivenitscurrentdeployment. But diverse and unanticipated use is unavoidable for an The goal of CoralCDN was to make desired web content opensystem,yetopennessisanecessarydesignchoicefor available to everybody, regardless of the publisher’s own handlingthefinalflash-crowdscenario. resources or dedicated hosting services. To do so, Coral- This paper provides a retrospective of our experience CDNprovidesanopen,self-organizingwebcontentdistri- buildingandoperatingCoralCDNoverthepastfiveyears. bution network (CDN) that any publisher is free to use, Our purpose is threefold. First, after summarizing Coral- without any prior registration, authorization, or special CDN’s published design [14] in Section §2, we present configuration. PublishingthroughCoralCDNisassimple data collected over the system’s production deployment asappendingasuffixtoaURL’shostname,e.g.,http:/ andconsideritsimplications. Second,wediscussvarious /example.com.nyud.net/. ThisURLmodification deployment challenges we encountered and describe our maybedonebyclients,originservers,orthirdpartiesthat preferred solutions. Some of these changes we have im- link to these domains. Clients accessing such Coralized plementedandincorporatedintoCoralCDN;othersrequire URLs are transparently directed by CoralCDN’s network adoption by third-parties. Third, given these insights, we of DNS servers to nearby participating proxies. These revisittheproblemofbuildingasecure,open,andscalable proxies,inturn,coordinatetoservecontentandthusmin- contentdistributionnetwork. Morespecifically,thispaper imizeloadonoriginservers. addressesthefollowingtopics: CoralCDN was designed to automatically and scalably • ThesuccessofCoralCDN’sdesigngivenobservedus- handle sudden spikes in traffic for new content [14]. It agepatterns(§3). 
Ourverdictismixed: Alargema- canefficientlydiscovercachedcontentanywhereinitsnet- jority of its traffic does not require any cooperative work, and it dynamically replicates content in proportion caching at all, yet its handling of flash crowds relies toitspopularity. Bothtechniqueshelpminimizeoriginre- onsuchcooperation. questsandsatisfychangingtrafficdemands. Whileoriginallydesignedfordecentralizedandunman- • Web security implications of CoralCDN’s open API aged settings, CoralCDN was deployed on the PlanetLab (§4). Through its open API, sites began leveraging research network [27] in March 2004, given PlanetLab’s CoralCDN as an elastic resource for content distri- 1 bution. Yet this very openness exposed a number of Client Resolver web security challenges. Many can be attributed to 22 1 5 alackofexplicitnessforspecifyingappropriatepro- tection domains, and they arise due to violations of CoralCDN CoralCDN CoralCDN CoralCDN HTTP Proxy DNS Server traditionalsecurityprinciples(suchasleastprivilege, HTTP Proxy DNS Server completemediation,andfail-safedefaults[33]). Coral index node Coral index node 33 66 • Resource management in CDNs (§5). CoralCDN 4 commonly faced the challenge of interacting with CoralCDN HCToTrPa lPCrDoNxy DCNoSra SleCrDvNer CoralCDN oversubscribed and ill-behaved resources, both re- Coral index node HTTP Proxy Coral index node moteoriginserversanditsowndeploymentplatform. Various aspects of its design react conservatively to Figure1: ThestepsinvolvedinservingaCoralizedURL. changeandperformadmissioncontrolforresources. 2.1 Systemoverview • Desired properties for deployment platforms (§6). Application deployments could benefit from greater At a high level, the following steps occur when a client visibility into and control over lower layers of their issuesarequesttoCoralCDN,asshowninFigure1. platforms. Some challenges are again confounded 1. Resolving DNS. 
A client resolves a “Coralized” when information and policies cannot be expressed domain name (e.g., of the form example.com. explicitlybetweenlayers. nyud.net)usingCoralCDNnameservers.ACoral- CDN nameserver probes the client to determine its • Directions for building large-scale, cooperative round-trip-time and uses this information to deter- CDNs (§7). While using decentralized algo- mineappropriatenameserversandproxiestoreturn. rithms, CoralCDN currently operates on a centrally- administered,smaller-scaletestbedoftrustedservers. 2. Processing HTTP client requests. The client sends Werevisitthechallengeofescapingthissetting. an HTTP request for a Coralized URL to one of the returnedproxies. Iftheproxyiscachingthewebob- RatherthanfocusonCoralCDN’sself-organizingalgo- jectlocally, itreturnstheobjectandtheclientisfin- rithms,themajorityofthispaperanalyzesCoralCDNasan ished. Otherwise, the proxy attempts to find the ob- exampleofanopenwebserviceonavirtualizedplatform. jectonanotherCoralCDNproxy. Assuch,theexperienceswedetailmayhaveimplications 3. Discoveringcooperative-cachedcontent. Theproxy toawideraudience,includingthosedevelopingdistributed looksuptheobject’sURLintheCoralindexinglayer. hash tables (DHTs) for key-value storage, CDNs or web 4. Retrievingcontent. IfCoralreturnstheaddressofa services for elastic provisioning, virtualized network fa- nodecachingtheobject,theproxyfetchestheobject cilities for programmable networks, or cloud computing fromthisnode. Otherwise, theproxydownloadsthe platforms for virtualized hosting. While many of the ob- objectfromtheoriginserverexample.com. servationswereportareneithernewnorsurprisinginhind- sight,manyrelatetomistakes,oversights,orlimitationsof 5. Servingcontenttoclients. Theproxystorestheweb CoralCDN’soriginaldesignthatonlybecameapparentto objecttodiskandreturnsittotheclientbrowser. usfromitsdeployment. 6. Announcingcachedcontent. 
Theproxystoresaref- WenextreviewCoralCDN’sarchitectureandprotocols; erencetoitselfinCoral,recordingthefactthatisnow amorecompletedescriptioncanbefoundin[14]. Allsys- cachingtheURL. temdetailspresentedafter§2weredevelopedsubsequent ThissectionreviewsthedesignoftheCoralindexinglayer to that publication. We discuss related work throughout andtheCDN’sproxies,asproposedin[14]. thepaperaswetouchondifferentaspectsofCoralCDN. 2.2 Coralindexinglayer 2 Original CoralCDN Design TheCoralindexinglayeriscloselyrelatedtothestructure andorganizationofdistributedhashtableslikeChord[34] The Coral Content Distribution Network is composed of andKademlia[23],withthelatterservingasthebasisfor threemainparts:(1)anetworkofcooperativeHTTPprox- its underlying algorithm. The system maps opaque keys ies that handle client requests from users, (2) a network ontonodesbyhashingtheirvalueontoaflat,semantic-free of DNS nameservers for nyud.net that map clients to identifier (ID) space; nodes are assigned identifiers in the nearby CoralCDN HTTP proxies, and (3) the underlying sameIDspace.Itallowsscalablekeylookup(inO(log(n)) Coralindexinginfrastructureandclusteringmachineryon overlay hops for n-node systems), reorganizes itself upon whichthefirsttwoapplicationsarebuilt. Thispapercon- networkmembershipchanges,andprovidesrobustbehav- sistentlyreferstothesystem’sindexinglayerasCoral,and ioragainstfailure. theentirecontentdistributionsystemasCoralCDN. 2 Compared to “traditional” DHTs, Coral introduced a 22 33 few novel techniques that were well-suited for its partic- 11 ular application [13]. Its key-value indexing layer was 4 designed with weaker consistency requirements in mind, and its lookup structure self-organized into a locality- optimizedhierarchyofclustersofpeers. Afterall,aclient 000… Distance to key 111… need not discover all proxies caching a particular file, it 4 Thresholds only needs to find several such proxies, preferably ones None nearby. 
LikemostDHTs,Coralexposesputandgetoper- 3 ations,toannounceone’saddressascachingawebobject, < 80 ms andtodiscoverotherproxiescachingtheobjectassociated 2 withaparticularURL,respectively. Insertedaddressesare 1 < 30 ms soft-statemappingswithatime-to-live(TTL)value. Figure 2: Coral’sthree-levelhierarchicaloverlaystructure. Anode Coral’s put and get operations are designed to spread first queries others in its level-2 cluster (the dotted rings), where load,bothwithintheDHTandacrossCoralCDNproxies. pointersreferenceothercachingproxieswithinthesamecluster.Ifa Togettheproxyaddressesassociatedwithakeyk,anode nodefindsamappinginitslocalcluster(afterstep2),itsgetfinishes. traverses the ID space with iterative RPCs, and it stops Otherwise,itcontinuesamongitslevel-1cluster(thesolidrings),and upon finding any remote peer storing values for k. This finally,ifneeded,toanynodewithinthegloballevel-0system. peer need not be the one closest to k (in terms of DHT identifier space distance). To put a key/value pair, Coral originserveraftertheCoralindexinglayerprovidesnore- routes to nodes successively closer to k and stops when ferralsornoneofitsreferralsreturnthedata. finding either (1) the nodes closest to k or (2) one that is CoralCDN’s inter-proxy transfers are optimized for lo- experiencinghighrequestratesforkandalreadyiscaching cality,bothfromtheiruseofparallelconnectionstoother severalcorrespondingvalues(withlonger-livedTTLs). It proxiesandbytheorderinwhichneighboringproxiesare stores the pair at the node closest to k that it managed to contacted. The properties of Coral’s hierarchical index- reach. TheseprocessespreventtreesaturationintheDHT. ingensuresthatthelistofproxiesreturnedbygetwillbe To improve locality, these routing operations are not sortedbasedontheirclusterdistancetotherequestinitia- initially performed across the entire global overlay. 
In- tor.Thus,proxieswillattempttocontactlevel-2neighbors stead, each Coral node belongs to several distinct routing beforelevel-1andlevel-0proxies,respectively. structurescalledclusters. Eachclusterischaracterizedby a maximum desired network round-trip-time (RTT). The 2.3.2 Rapidadaptationtoflashcrowds system is parameterized by a fixed hierarchy of clusters with different expected RTT thresholds. Coral’s deploy- Unlike many web proxies, CoralCDN is explicitly de- mentusesathree-levelhierarchy,withlevel-0denotingthe signedforflash-crowdscenarios.Ifaflashcrowdsuddenly global cluster and level-2 the most local one. Coral em- arrivesforawebobject,proxiesself-organizeintoaform ploysdistributedalgorithmstoformlocalized,stableclus- of multicast tree for retrieving the object. Data streams ters,whichwebrieflyreturntoin§5.3. from the proxies that started to fetch the object from the Every node belongs to one cluster at each level, as in originservertothosearrivinglater. Thislimitsconcurrent Figure2. Coralqueriesnodesinfastclustersbeforethose objectrequeststotheoriginserveruponaflashcrowd. in slower clusters. This both reduces lookup latency and CoralCDNprovidessuchbehaviorbycut-throughrout- increases the chance of returning values stored at nearby ing and optimistic references. First, CoralCDN’s use of nodes,whichcorrespondtoaddressesofnearbyproxies. cut-through routing at each proxy helps reduce transmis- siontimeforlargerfiles. Thatis,aproxywilluploadpor- 2.3 TheCoralCDNHTTPproxy tionsofaobjectassoonastheyaredownloaded,notwait- inguntilitreceivestheentireobject. Second,proxiesopti- CoralCDN seeks to aggressively minimize load on origin misticallyannouncethemselvesassourcesofcontent. As servers.ThissectionsummarizeshowitsproxiesuseCoral soonasaCoralCDNproxybeginsreceivingthefirstbytes forinter-proxycooperationandadaptationtoflashcrowds. 
ofawebobject—eitherfromtheoriginoranotherproxy— it inserts a reference to itself into Coral with a short TTL 2.3.1 Locality-optimizedinter-proxytransfers (30 seconds). It continually renews this short-lived refer- Each CoralCDN proxy keeps a local cache from which it enceuntileitheritcompletesthedownload(atwhichtime can immediately fulfill client requests. When a client re- itinsertsalonger-livedreference1)orthedownloadfails. questsanon-residentURL,CoralCDNproxiesattemptto 1Thedeployedsystemuses2-hourTTLsforsuccessfulresults(status fetchwebcontentfromeachother,usingtheCoralindex- codesof200,301,302,etc.),and15-minuteTTLsfor403,404,andother ing layer for discovery. A proxy only contacts a URL’s unsuccessful,non-transientresults. 3 ns) 100 FTroo Ump Cstlrieenatms Proxy/Origin Year dUomniaqiunes UUniRqLues w%ithU1RrLeqs pRoepquslatorUmRosLt o Milli 2005 7881 577K 54% 697K y ( 10 2007 21555 588K 59% 410K a er D 2009 20680 1787K 77% 1578K p s est 1 Figure5: CoralCDNtrafficstatisticsforanarbitraryday(Aug9). u q e R 0.1 depth. Figure 3 plots the total number of HTTP requests Jan’05 Jan’06 Jan’07 Jan’08 Jan’09 Jan’10 thatthesystemreceivedeachdayfrommid-2004through Figure3:TotalHTTPrequestsperdayduringCoralCDN’sdeploy- early 2010, showing both the number of HTTP requests ment.Grayedregionscorrespondtomissingorincompletedata. from clients, as well as the number of requests issued to #IPs per Day (Millions) 01 ..01255 2005 2007 2009 Data Sent per Day (GB) 11223 505050000000 0000000 2005 2007 2009 ucmwdopeiempstWnhttrmtheem,baoeeoemxnartacwemrCheeroeiqecnrcunoaeeenldn5sCstttiDhsarrtarnaNietndteeegpss2teoio0meffforeHms4r0poTiml–erTlr5uioioP0cornihdtgmrsHiaonifflfTfirlsioTcCoimtnPeoosvrtd.raheaelTerCiqslhtuyeDheeelrNstoerstgq’saassucmpeedienessertpsnmshd.li2onooayerwye--, Figure 4: CoralCDNusage: numberofuniqueclients(left)and dayperiod(August9–18)in2005,2007,and2009. Coral- uploadvolume(right)foreachdayduringAugust9–18. 
CDNreceived15–25Mrequestsduringeachdayofthese periods.Figure4plotsthetotalnumberofuniqueclientIP addresses from which these requests originated (left) and 2.4 Implementationanddeployment theaggregateamountofbandwidthuploaded(right). The CoralCDNiscomposedofthreestand-aloneapplications. traces showed 1–2 million clients per day, resulting in a TheCoraldaemonprovidesthedistributedindexinglayer, fewterabytesofcontenttransferred. Wewillprimarilyuse accessedoverUNIXdomainsocketsfromasimpleclient the2009trace,consistingof209Mrequests,inlateranal- librarylinkedintoapplicationssuchasCoralCDN’sHTTP ysis. Figure5providesmoreinformationaboutthetraffic proxyandDNSserver. Allthreearewrittenfromscratch. patterns,focusingonthefirstdayofeachtrace. Coral network communication uses Sun RPC over UDP, Figure 6 plots the distribution of requests per unique while CoralCDN proxies transfer content via standard URL.WeseethatthenumberofrequestsperURLfollows HTTP connections. At initial publication [14], the Coral a Zipf-like distribution, as common among web caching daemon was about 14,000 lines of C++, the DNS server andproxynetworks[5]. CertainURLsareverypopular— 2,000 LOC, and the proxy 4,000 LOC. CoralCDN’s im- theso-called“head”ofthedistribution—suchasthemost plementationhassincegrowntoaround50,000LOC.The popular one in the Aug-9-2009 trace, which received al- changeswelaterdiscusshelpaccountforthisincrease. most1.6Mrequestsitself. AlargenumberofURLs—the CoralCDNtypicallyrunson300–400PlanetLabservers distribution’s“heavytail”—receiveonlyasinglerequest. (about 70–100 of which run its DNS server), spread over The datasets also show stability in the most popular 100-200 sites worldwide. It avoids Internet2-only and URLs and domains over time. In all three datasets, the commercialsites,thelatterduetopolicydecisionsthatre- most popular URL retained that ranking across all nine stricttheiruseforopenservices. CoralCDNusesnospe- days. 
In fact, this URL in the 2007 and 2009 traces be- cialknowledgeofthesemachines’locationsorconnectiv- longedtothesamedomain: asitethatusesCoralCDNto ity(e.g.,GPScoordinates,routinginformation,etc.).Even distributerule-setupdatesforthepopularFirefoxAdBlock though CoralCDN runs on a centrally-managed testbed, browser extension. Exploring this further, Figure 7 uses its mechanisms remain decentralized and self-organizing. the2009tracetoplottherequestrateperdayforthemost The only use of centralization is for managing software populardomains(takingtheunionofeachday’smostpop- andconfigurationupdatesandforcontrollingrunstatus. ularfivedomainsresultedinnineuniquedomains).Wesee that six of the nine domains had stable traffic patterns— 3 Analyzing CoralCDN’s Usage theywerelong-termCoralCDN“customers”—whilethree varied between two and six orders of magnitude per day. This section presents some HTTP-level data from Coral- The traffic patterns that we see in these two figures have CDN’sdeploymentandconsidersitsimplications. designimplications,whichwediscussnext. 3.1 Systemtracesandtrafficpatterns 2Thepeakof120MrequestsonAugust21, 2008correspondstoa To understand some of the HTTP traffic patterns that short-livedexperimentofanacademicresearchprojectusingCoralCDN CoralCDNsees,weanalyzedseveraldatasetsinincreasing asakey-valuestore[15]. 4 1e+06 TopURLs TotalSize(MB) %ofTotalReqs URL 100000 AAuugg--99--22000057 0.01% 14 49.1% er 10000 Aug-9-2009 0.1% 157 71.8% p ests 1 100000 1% 3744 84.8% equ 10 10% 28734 92.2% R 1 1 10 100 1000 10000 100000 1e+06 Figure8: CoralCDN’sworkingsetsizeforitsmostpopularURLs Unique URLs by Popularity on Aug 9, 2009: A small percentage of URLs account for a large Figure6: TotalrequestsperuniqueURL. fractionofrequests,yettheyrequirerelativelylittlestoragetocache. n 1e+07 mai 1e+06 Insufficientforresurrectingoldcontent. CoralCDNis er do 1 1000000000 not designed for archival storage. 
Proxies do not proac- s p 1000 tively replicate content for durability, and unpopular con- quest 1 1000 tent is evicted from proxy caches over time. Further, if Re 1 content has an expiry time (default is 12 hours), a proxy 1 2 3 4 5 6 7 8 9 will serve expired content for at most 24 hours after the Time (Days) origin fails. Still, some clients attempt to use Coral- Figure7: Requestspertop-5domainovertime(Aug9-18,2009). CDN for this purpose. This underscores a design trade- 3.2 Implicationsofusagescenarios off: Instressingsupportforflashcrowdsratherthanlong- termdurability,CoralCDNdevotesitsresourcestoprovide ForCoralCDNtohelpunder-provisionedwebsitessurvive availability for content being actively requested. On the unexpectedtrafficspikes,itdoesnotrequireanypriorreg- otherhand,byservingexpiredcontentforalimiteddura- istrationorauthorization. Yetwhilesuchopennessisnec- tion,CoralCDNcanmaskthetemporaryunavailabilityof essarytoenableevenunmanagedwebsitestosurviveflash anorigin,atleastforcontentalreadycachedinitsnetwork. crowds,itcomesatacost: CoralCDNisusedinavariety ofwaysthatdifferfromthismorenarrowgoal. Thissec- Unnecessary for unpopular content. While proxies tion considers how well CoralCDN’s design is suited for candiscoverevenrarecachedcontent,CoralCDNdoesnot itsfourmainusagescenarios: provideanybenefitbyservingsuchunpopularcontent: It does not reduce servers’ load meaningfully, and it often 1. Resurrectingoldcontent: Anecdotally,someclients results in higher client latency. As such, clients that use attempt to use CoralCDN for long-term durability. CoralCDNtoavoidlocalfiltering,circumventgeographic One can download browser plugins that link to both restrictions,orprovide(minimal)anonymitymaybebetter CoralCDN and archive.org as potential sources servedbystandardopenproxies(thatvanillabrowserscan ofcontentwhenoriginserversareunavailable. beconfiguredtouse)orthroughspecializedtoolssuchas 2. Accessing unpopular content: CoralCDN’s request Tor[12]. 
Yet,thistypeofusagepersists—thelongtailof distribution shows a heavy tail of unpopular URLs. Figure6—andCoralCDNmightthenbebetterservedwith ServersmayCoralizeURLsthatfewvisit. Andsome adifferentdesignforsuchtraffic,i.e.,onethatdoesn’tre- clients use CoralCDN as a more traditional proxy, quireamulti-hop,wide-areaDHTlookuptocompletebe- for(presumed)anonymity,censorshiporfilteringcir- forefetchingcontentfromtheorigin. Forexample,forits cumvention[32],orautomatedcrawling. modest deployment on PlanetLab, each Coral node could maintainconnectivitytoallothersandsimplyuseconsis- 3. Serving long-term popular content: Most requests tent hashing for a global, one-hop DHT [17, 37]. Alter- areforasmallsetofpopularobjects. Theseobjects, natively, Coral could only maintain connections with re- alreadywidelycachedacrossthenetwork, belongto gional peers and eschew global lookups, a design which thestablesetofcustomerdomainsthateffectivelyuse weevaluatefurtherin§7. CoralCDNasafree,long-termCDNprovider. 4. Surviving flash crowds to content: Finally, Coral- Overkill for stably popular content, so far. For most CDNisusedforitsstatedgoalofenablingunderpro- ofCoralCDN’straffic,cooperationisnotneeded: Figure6 visioned websites to withstand transient load spikes. shows that a small number of URLs accounts for a large PopularportalsregularlylinktoCoralizedURLs,and fraction of requests. We now measure their working set userspostlinksincomments. Somesitesevenadopt sizeinFigure8,inordertodeterminehowmuchstorageis dynamic and programmatic mechanisms to redirect requiredtohandlethistraffic. Wefindthatthemostpopu- requests to CoralCDN, based on observed load and lar0.01%ofURLsaccountformorethan49%ofthetotal requestreferrers. Wediscussthisfurtherin§4.1. requeststoCoralCDN,yetrequireonly14MBofstorage. Each proxy has a 3.0 GB disk cache, managed using an Unfortunately, CoralCDN’s design is not well-suited for LRU eviction policy. This is sufficient for serving nearly thefirstthreeusecases. 
85%ofallrequestsfromlocalcache. 5 18000 70.4%hitinlocalcache slashdot.org referer 16000 10-May-05 12.6%returned4xxor5xxerrorcode 14000 09:15 EST 11:56 97..91%%ffeettcchheeddffrroommoorthigeirnCsoitrealCDNproxy minute 12000 |→1.7% fromlevel-0cluster(global) er 10000 p |→1.9% fromlevel-1cluster(regional) sts 8000 e |→3.6% fromlevel-2cluster(local) qu 6000 e R 4000 Figure9: CoralCDNaccessratiosforcontentduringAug9,2009. 2000 0 These workload distributions support one aspect of -30 0 30 60 90 120 150 180 CoralCDN’s design: Content should be locally cached Time (minutes) by the “forward” CoralCDN proxy directly serving end- Figure10: FlashcrowdtoaCoralizedURLlinkedtobySlashdot. clients, given that small to moderate size caches in these proxies can serve a very large fraction of requests. This 600 moonbuggy.org redditmirror.cc moonbuggy reddit differs from the traditional DHT approach of just storing 500 1000 e data on a small number of globally-selected proxies, so- ut called“serversurrogates”[8,37]. er min 400 100 IfCoralCDN’sworkingsetcanbefullycachedbyeach s p 300 10 st node,weshouldunderstandhowmuchcooperationisac- que 200 1 e tually needed. Figure 9 summarizes the extent to which R proxies cooperate when handling requests. 70% of re- 100 queststoproxiesaresatisfiedlocally,whileonly7%result 0 incooperativetransfers. (Thehighrateoferrormessages 0 24 48 72 96 120 144 168 Time (Hours) isduetoadmissioncontrolasameansofbandwidthman- Figure11: Mini-flashcrowdsduringAugust2009trace.Eachdat- agement,whichwediscussin§5.2.)Inshort,atleastforits apointrepresentsaone-minuteduration;embeddedsubfiguresshow currentworkloadandenvironment,onlyasmallfractionof requestratesforthetensofminutesaroundtheonsetofflashcrowds. CoralCDN’strafficusesitscooperationmechanisms. Arelatedresultaboutthelimitsofcooperativecaching long leading edge before traffic rises by several orders of hadbeenobservedearlier[38],butfromtheperspectiveof magnitude,whichhasinterestingimplications. 
limitedimprovementsinclient-sidehitrates. Thisisasig- Figures 10 and 11 show the request patterns of several nificantlydifferentgoalfromreducingserver-siderequest flashcrowdsthatCoralCDNexperienced. Theformerwas rates, however: Non-cooperating groups of nodes would toasitelinkedtoinaSlashdotarticleinMay2005. After eachindividuallyrequestcontentfromtheorigin. rising,theSlashdotflashcrowdlastedlessthanthreehours This design trade-off comes down to the question of in duration and came to an abrupt conclusion (perhaps as how much traffic is too much for origin servers. For thestorydroppedoffthewebsite’smainpage). Thelatter, moderately-provisioned origins, such as the customers of covering our August 2009 trace, shows spikes to the im- commercial CDNs, a caching system might only rely on agecacheofalesspopularportal(moonbuggy.org),as local or regional cooperation. In fact, Akamai’s network wellastoawell-publicizedmirrorforthecollaboratively- is designed precisely so: Nodes within each of its ap- filtered reddit.com, with another attenuated spike 24 proximately1000clusterscooperate,buteachclustertypi- hourslater. TheembeddedgraphsinFigure11depictthe callyfetchescontentindependentlyfromoriginsites[22]. requestratesaroundtheonsetofthetrafficspikeforanar- To replicate such scenarios, Coral’s clustering algorithms rowerrangeoftime. Allthreeflashcrowdsshowthatthe could be used to self-organize a network into local or re- initialrisetookminutes. gionalclusters. Itcouldthusavoidthemanualconfigura- Foramorequantitativeanalysisofthefrequencyofflash tionofHarvest[7]orcolocateddeploymentsofAkamai. crowds, we examined the prevalence of domains that ex- Ontheotherhand, whilecooperationisnotneededfor perience a large increase in their request rates from one most traffic, CoralCDN’s ability to react quickly to flash time period to the next. 
In particular, Figure 12 consid- crowds—tooffloadtrafficfromafailingoroversubscribed ers all five-second periods across the August 2009 ten- origin—ispreciselythescenarioforwhichitwasdesigned day trace. The left graph plots a complementary cumu- (andcommercialCDNsarenot). Weconsiderthesenext. lative distribution function (CCDF) of the percentage of Usefulformitigatingflashcrowds. CoralCDN’straces domainsrequestedineachperiodthatexperiencea10-or regularly show spikes in requests to different URLs. We 100-fold rate increase. The right graph plots the percent- find, however, that these flash crowds grow in popularity age of requests accounted for by these domains that ex- ontheorderofminutes,notseconds.Thereisasufficiently perience orders-of-magnitude (OOM) increases. Sudden 6 Percentage of Epochs 0 .01 0 0.1000 .100111 ≥≥ 512s OO eOOpMMochs Percentage of Epochs 0 .01 0 0.1000 .100111 ≥≥ 512s OO eOOpMMochs owcraofitutIeChnniiotnnssrchafsroolehCrraotD,sarettsNshmti’ivssmaeldredlyaofptrrmaaeacrrasietihoinlooydsnwso,eoscxf(ctp2tuhhe)rarettiohetn(ono1tcst)aheeloerdnleoaolqrrymdgueeeaarsistrnosamsf,t’eaasnletlricdnafofc(rfin3aredcc)astaisa.onecyns- 1 10 1 10 100 % Domains that changed by OOM % Requests that changed by OOM Thismoderateadoptionrateavoidstheneedtointroduce Figure 12: CCDF of extent of flash-crowd dynamics in August evenmoreaggressivecontentdiscoveryalgorithms. Sim- 2009trace.Leftgraphshowspercentageofdomainsexperiencingor- ulated workloads in early experiments (Figure 4 of [14]) dersofmagnitude(OOM)changesinrequestratesacrossfive-second showedthatunderhighconcurrency,CoralCDNmightis- epochs.Rightshows%requestsforwhichthesedomainsaccount. sue several redundant fetches to an origin server due to a race-like condition in its lookup protocol. 
If multiple Percentage of Epochs 1 24680 000000 ≥≥3 210 sOO eOOpMMochs Percentage of Epochs 1 24680 000000 ≥≥≥1 3210 mOOO OOOepMMMochs ninbsooytddimneesstohccseoatnniancpcduopernlxriet,canaactltltliyotchngoesentcowturhihrgerieicsnnha.tmTuloeshoekikseauyrpadwcsiehcstaicrcnoihbnfudadtioieltediaosnnhndaoimssthysuheltttaairebpexlldee- 0.01 0.1 1 10 100 0.01 0.1 1 10 100 (both peer-to-peer and datacenter services). But because % Requests that changed by OOM % Requests that changed by OOM thesetracesshowthatthearrivalofuserrequestshappens Epochs 1 8000 Epochs 1 8000 over a much longer time-scale than a DHT lookup, this Percentage of 246 0000 ≥≥≥≥ 4321 6OOOOhOOOO eMMMMpochs Percentage of 246 0000 ≥≥≥≥1 4321d OOOOeOOOOpoMMMMchs rdaecsNeigocntoeinntdghiataitoinnteidstwopeoosrsknsiobfitllepeotsosyemsateitsmiigganftoiefirtchPainlsatcnpoerntoLdbailtbeiomtnh..aWt shuiple- 0.01 0.1 1 10 100 0.01 0.1 1 10 100 ported cooperative caching [2]—meant to quickly dis- % Requests that changed by OOM % Requests that changed by OOM Figure 13: CDFsofpercentageofrequestsaccountedforbydo- tribute a file in preparation for a new experiment—we soughttominimizeredundantfetchestothefileserver.We mains experiencing order(s)-of-magnitude rate increases. Rate in- extended Coral’s insert operation to provide return status creasescomputedacrossepochsof30seconds(topleft),10minutes information, like test-and-set in shared-memory systems. (topright),sixhours(bottomleft),andoneday(bottomright). Plots A single put+get both returns the first values it encoun- startonthey-axiswithzerodomainshavingsuchanincrease,e.g., tered in the DHT, as well as inserts its own values at an 28%of30sepochshavenodomainswitha≥1OOMrateincrease. appropriate location (for a new key, this would be at its closest node). This optimization comes at a subtle cost, increasesdoexist,buttheyarerare. 
In76%of5sepochs, however, asitnowoptimisticallyinsertsanode’sidentity nodomainsexperiencedany10-foldincrease,whilein1% evenbeforethatproxybeginsdownloadingthefile! Ifthe ofepochs, 1.7%ofdomains(accountingfor12.9%ofre- originfetchfails—agreaterpossibilityinCoralCDN’sen- quests) increased by one order-of-magnitude. Larger dy- vironmentthanwithamanagedfileserver—thentheuseof namismwasevenmorerare:onlyin0.006%ofepochsdid theseindexentriesdegradesperformance. Thus, afterus- there exist a domain that experienced a 100-fold increase ingthisput+getprotocolinCoralCDNforseveralmonths inrequestrate. NothreeOOMincreaseoccurred. during2005,wediscontinueditsuse. To further understand the precipitousness of “flash” crowds, Figure 13 extends this analysis across longer du- CoralCDN’sopennesspermitsuserstoquicklyleverage rations.3 Among30sepochs,50%ofepochshaveatmost its resources under load, and its more complex coordina- 0.4% of domains experience a 10-fold increase in their tionhelpsmitigatetheseflashcrowdsandmasktemporary rates (not shown), which account for a total of 1.0% of serverunavailability. Yetthisveryopennessledtovaried requests (top left). Only 0.29% of 30s epochs have any usage,themajorityofwhichdoesnotrequireCoralCDN’s domains with more than a 100-fold rate increase. At 10- morecomplexdesign. Aswewillsee, thisopennessalso minute epochs, 28% of epochs have at least one domain introducesotherproblems. that experiences a two OOM rate increase, while 0.21% have a domain with a three OOM increase. Still, these 4 Lessons for the Web flashcrowdsaccountforasmallfractionoftotalrequests: Domainsexperiencing100-foldincreasesaccountedforat CoralCDN’s naming technique provides an open API for least1%ofallrequestsinonly3.8%of10mepochs, and CDN services that can transparently work for almost any 10%ofrequestsin0.05%ofepochs. website. 
Over the course of its deployment, clients and servers have used this API to adopt CoralCDN as an elastic resource for content distribution. Through completely automated means, work can be dynamically expanded out to use CoralCDN when websites require additional bandwidth resources, and it can be contracted back when flash crowds abate. In doing so, its use presaged the notion of “surge computing” with public cloud platforms. But these naming techniques and CoralCDN’s open design introduce a number of web security problems, many of which are engendered by a lack of explicitness for specifying protection domains. We discuss these issues here.

Footnote 3: To avoid overcounting unpopular domains, we do not count changes when the absolute number of requests to a domain in a given time period is less than some minimum amount, i.e., 10 requests for 5s, 30s, and 10m periods, and 100 requests for 6h and 1d periods.

4.1 An API for elastic CDN services

We believe that the central reason for CoralCDN’s adoption has been its simple user interface and open design.

Interface design. While superficially obvious, CoralCDN’s interface design achieves several important goals:

• Transparency: Work with unmodified, unconfigured, and unaware web clients and web servers.
• Deep caching: Retrieve embedded images or links automatically through CoralCDN when appropriate.
• Server control: Not interfere with sites’ ability to perform usage logging or otherwise control how their content is served (e.g., via CoralCDN or directly).
• Ad-friendly: Not interfere with third-party advertising, analytics, or other tools incorporated into a site.
• Forward compatible: Be amenable to future end-to-end security mechanisms for content integrity or other end-host deployed mechanisms.

Consider an alternative and even simpler interface design [11, 25, 29], in which one embeds origin URLs into the HTTP path, e.g., http://nyud.net/example.com/. Not only is HTTP parsing simpler, but nameservers would not need to synthesize DNS records on the fly (unlike our DNS servers for *.nyud.net). Unfortunately, while this interface can be used to distribute individual objects, it fails on entire webpages. Any relative links would lack the example.com prefix that a proxy needs to identify its origin. One alternative might be to try to rewrite pages to add such links, although active content such as javascript makes this notoriously difficult. Further, such active rewriting impedes a site’s control over its content, and it can interfere with analytics and advertisements.

CoralCDN’s approach, however, interprets relative links with respect to a page’s Coralized hostname, and thus transparently requests these objects through it as well. But all absolute URLs continue to point to their origin sites, and third-party advertisements and analytics remain largely unaffected. Further, as CoralCDN does not modify content, content also may be amenable to verification through end-to-end content signatures [30, 35].

In short, it was important for adoption that site owners retain sufficient control over how their content is displayed and accessed. In fact, our predicted usage scenario of sites publishing Coralized URLs proved to be less popular than that of dynamic redirection (which we did not foresee).

An API for dynamic adoption. CoralCDN was envisioned with manual URL manipulation in mind, whether by publishers editing HTML, users typing Coralized URLs, or third-parties posting links. After deployment, however, users soon began treating CoralCDN’s interface as an API for accessing CDN services.

On the client side, these techniques included simple browser extensions that offer “right-click” options to Coralize links or that provide a link when a page appears unavailable. They ranged to more complex integration into frameworks like Firefox’s Greasemonkey [21]. Greasemonkey allows third-party developers to write site-specific javascript code that, once installed by users, manipulates a site’s HTML content (usually through the DOM interface) whenever the user accesses it. Greasemonkey scripts for CoralCDN include those that automatically rewrite links on popular portals, or modify articles to include tooltips or additional links to Coralized URLs. CoralCDN also has been integrated directly into a number of client-side software packages for podcasting.

The more interesting cases of CoralCDN integration are on the server-side. One common strategy is for the origin to receive the initial request, but respond with a 302 redirect to a Coralized URL. This can work well even for flash crowds, as the overhead of generating redirects is modest compared to that of actually serving the content.

Generating such redirects can be done by installing a server plugin and writing a few lines of configuration code. For example, the complete dynamic redirection rule using Apache’s mod_rewrite plugin is as follows.

    RewriteEngine on
    RewriteCond %{HTTP_USER_AGENT} !^CoralWebPrx
    RewriteCond %{QUERY_STRING} !(^|&)coral-no-serve$
    RewriteRule ^(.*)$ http://%{HTTP_HOST}.nyud.net%{REQUEST_URI} [R,L]

Still, redirection rules must be crafted carefully. In this example, the second line checks whether the client is a CoralCDN proxy and thus should be served directly. Otherwise, a redirection loop potentially could be formed (although proxies prevent this from happening by checking for potential loops and returning errors if one is found).

Amusingly, some early users during CoralCDN’s deployment caused recursion in a different way (and a form of amplification attack) by submitting URLs with a long string of nyud.net’s appended to a domain. Before proxies checked for such conditions, this single request caused a proxy to issue a number of requests, stripping the last instance of nyud.net off in each iteration.

While the above rewriting rule applies for all requests, other sites incorporate redirection in more inventive ways, such as only redirecting clients arriving from particular high-traffic referrers:

    RewriteCond %{HTTP_REFERER} slashdot\.org [NC,OR]
    RewriteCond %{HTTP_REFERER} digg\.com [NC,OR]
    RewriteCond %{HTTP_REFERER} blogspot\.com [NC]

And most interestingly, some sites have even combined such tools with server plugins that monitor server load and bandwidth use, so that their servers only start rewriting requests under high load conditions.

Websites therefore used CoralCDN’s naming technique to leverage its CDN resources in an elastic fashion. Based on feedback from users, we expanded this “API” to give sites some simple control over how CoralCDN should handle their requests. For example, webservers can include X-Coral-Control response headers, which are saved as cache meta-data, to specify whether CoralCDN proxies should “redirect home” domains that exceed their bandwidth limits (per §5.2) or just return an error as is standard.

4.2 Security and resource protection

A number of security mechanisms curtailed the misuse of CoralCDN. We highlight the design principle for each.

4.2.1 Limiting functionality

CoralCDN proxies have only ever supported GET and HEAD requests. Many of the attacks for which “open” proxies are infamous [24] are simply not feasible. For example, clients cannot use CoralCDN to POST passwords for brute-force cracking. Proxies do not support CONNECT requests, and thus they cannot be used to send spam as SMTP relays or to forge “From” addresses in webmail. Proxies do not support HTTPS and they delete all HTTP cookies sent in headers. These proxies thus provide minimal application functionality needed to achieve their goals, which is cooperatively serving cacheable content.

CoralCDN’s design had several unexpected consequences. Perhaps most interestingly, given CoralCDN’s multi-layer caching architecture, attempting to crawl or brute-force attack a website via CoralCDN is quite slow. New or randomly-selected URLs first require a DHT lookup to fail, which serves to delay requests against an origin website, in much the same way that ssh “tarpits” delay responses to failed login attempts. In addition, because CoralCDN only handles explicit Coralized URLs, it cannot be used by simply configuring a vanilla browser’s proxy settings. Further, CoralCDN cannot be used to anonymously launch attacks, as it eschews anonymity. Proxies use unique User-Agent strings (“CoralWebPrx”) and include their identity in Via headers, and they report an instigating client’s IP address to the origin server (in an X-Forwarded-For request header). We can only surmise whether the combination of these properties played some role, but CoralCDN has seen little abuse as a platform for proxying server attacks.

4.2.2 Curtailing excessive resource use

CoralCDN’s major limiting resource is aggregate bandwidth. The system employs fair-sharing mechanisms to balance bandwidth consumption between origin domains, which we discuss further in §5.2. In addition to monitoring server-side consumption, proxies keep a sliding window of client-side usage. Not only do we seek to prevent excessive bandwidth consumption by clients, but also an excessive number of (even small) requests. These are caused typically by server misconfigurations that result in HTTP redirection loops (per §4.1) or by “bot” misuse as part of a brute-force attack. While CoralCDN’s limited functionality mitigates such attacks, one notable brute-force login attempt took advantage of poor security at a top-5 website, which used cleartext passwords over GET requests.

Given both its storage and bandwidth limitations, CoralCDN enforces a maximum file size of 50 MB. This has generally prevented clients from using CoralCDN for video distribution, a pragmatic goal when deploying proxies on university-hosted PlanetLab servers. We found that sites attempted to circumvent these limits by omitting Content-Length headers (on connections marked as persistent and without chunked encoding). To ensure compliance, proxies now monitor ongoing transfers and halt (and blacklist) any that exceed their limits. This skepticism is needed as proxies interact with potentially untrusted servers, and thus must enforce complete mediation [33] to their resources (in this case, bandwidth).

4.2.3 Blacklisting domains and offloading security

We maintain a global blacklist for blocking access to specified origin domain names. Each proxy regularly fetches and reloads the blacklist. This is a practical, but not fundamental, necessity, employed to prevent CoralCDN’s deployment sites from restricting its use. Parties that request blacklisting typically cite one of the following reasons.

Suspected phishing. Websites have been concerned that CoralCDN is, or will be confused with, a phishing site. After all, both appear to be “scraping” content and publish a simulacrum under an alternate domain. The difference, of course, is that CoralCDN is serving the site’s content unmodified, yet the web lacks any protocol to authenticate the integrity of content (as in S-HTTP [30]) in order to verify this. As SSL only authenticates identity, websites must typically include CDNs in their trusted computing base.

Potential copyright violation. Typically following a DMCA take-down notice, third-parties report that copyrighted material may be found on a Coralized domain and want it blocked. This scenario is mitigated by CoralCDN’s explicit naming (which preserves the name of the actual origin in question) and by its caching design. Once content is removed from an origin server, it is evicted automatically from CoralCDN in at most 24 hours. This is a natural implication of its goal of handling flash crowds, rather than providing long-term availability.

Circumventing access-control restrictions. Some domains mediate access to their website via IP-based authentication, whereby requests from particular IP prefixes are granted access. This practice is especially common for online academic journals, in order to provide easy access for university subscribers. But open proxies within whitelisted prefixes would enable external clients to circumvent these access-control restrictions.

By offloading policing to their customers, sites unnecessarily enlarge their security perimeter to include their customers’ networks. This scenario is common yet unnecessary. Recall that CoralCDN proxies do not hide their identities, and they include the originating client’s IP address in standard request headers. Thus, origin sites can retain IP-based authentication while verifying that a request does not originate from outside allowed prefixes. [Footnote 4] Sites are just not making use of this information, and thus fail to properly mediate access to their protected resources. [Footnote 5]

We did encounter some interesting attacks on our domain-based blacklists, akin to fast-flux networks. An adversary created dynamic DNS records for a random domain that pointed to the IP address of a target domain (an online academic journal). The random domain naturally was not blacklisted by CoralCDN, and the content was successfully downloaded from the origin target. Such a circumvention technique would not have worked if the origin site checked either proxy headers (as above) or even just the Host field of the HTTP request. The Host corresponded to the fast-flux attack domain, not that of the journal. Again, this security hole demonstrates a lack of explicit verification and fail-safe defaults [33].

4.3 Security and naming conflation

We argued that CoralCDN’s naming provided a powerful API for accessing CDN services. Unfortunately, its technique has serious implications as the Web’s Same Origin Policy (SOP) conflates naming with security.

Browsers use domain names for three purposes. (1) Domains specify where to retrieve content after they are resolved to IP addresses, precisely how CoralCDN enacts its layer of indirection. (2) Domains provide a human-readable name for what administrative entity a client is interacting with (e.g., the “common name” identified in SSL server certificates). (3) Domains specify what security policies to enforce on web objects and their interactions.

The Same Origin Policy specifies how scripts and instructions from an origin domain can access and modify browser state. This policy applies to manipulating cookies, browser windows, frames, and documents, as well as to accessing other URLs via an XmlHttpRequest. At its simplest level, all of these behaviors are only allowed between resources that belong to an identical origin domain. This provides security against sites accessing each others’ private information kept in cookies, for example. It also prevents websites that run advertisements (such as Google’s AdSense) from easily performing click fraud to pay themselves advertising dollars by programmatically “clicking” on their site’s advertisements. [Footnote 6]

One caveat to the strict definition of an identical origin [18] is that it provides an exception for domains that share the same domain.tld suffix, in that www.example.com can read and set cookies for example.com. This has bad implications for CoralCDN’s naming strategy. When example.com is accessed via CoralCDN, it can manipulate all nyud.net cookies, not just those restricted to example.com.nyud.net. [Footnote 7] Concerned with the potential privacy violations from this scenario, CoralCDN deletes all cookies from headers.

Unfortunately, many websites now manage cookies via javascript, so cookie information can still “leak” between Coralized domains on the browser. This happens often without a site’s knowledge, as sites commonly use a URL’s domain.tld without verifying its name. Thus, if the Coralized example.com writes nyud.net cookies, these will be sent to evil.com.nyud.net if the client visits that webpage. Honest CoralCDN proxies will delete these cookies in transit, but attackers can still circumvent this problem. For example, when a client visits evil.com.nyud.net, javascript from that page can access nyud.net cookies, then issue an XmlHttpRequest back to evil.com.nyud.net with cookie information embedded in the URL. Similar attacks are possible against other uses of the SOP, especially as it relates to the ability to access and manipulate the DOM. Note that these attack vectors exist even while CoralCDN operates on fully-trusted nodes, let alone more peer-to-peer environments!

Rather than conclude that CoralCDN’s domain manipulation is fundamentally flawed, we argue that better adherence to security principles is needed. Websites are partially at fault because they default access to domain.tld suffixes too readily, as opposed to stripping the minimal number of domain prefixes: a violation of the principle of least information. An alternative solution that embraces least

Footnote 4: This does not address the corner case in which the original request comes from an IP address within that prefix, while subsequent ones that access the then-cached content do not. This can be handled typically by marking content as not cacheable, or by having a proxy include headers that explicitly specify its client population (i.e., as “open” or by IP prefix).

Footnote 5: One might argue that sites use a pure IP-based filtering approach given its ability to be implemented in layer-3 front-end load balancers. But this is not a simple firewall problem, as sites also permit access for individual users that login with the appropriate credentials. The sites with which we communicated implemented such authorization logic either directly in webservers or in complex, layer-7 front-end appliances.

Footnote 6: This is prevented because advertisements like AdSense load in an iframe that the parent document (the third-party website that stands to gain revenue) cannot access, as the frame belongs to a different domain.

Footnote 7: Commercial CDNs like Akamai are typically not susceptible to such attacks, as they generally use separate top-level domains for each customer, as opposed to CoralCDN’s suffix-based approach. Unlike CoralCDN’s zero configuration, however, such designs require that origins preestablish an operational relationship with their CDN provider and point their domain to the CDN service (e.g., by aliasing their domain to the CDN through CNAME records in DNS).
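To make the cookie-scoping concern of §4.3 concrete, the following Python sketch contrasts scoping a cookie to a domain.tld-style suffix with scoping it to the exact host that set it. It is an illustration only and not part of CoralCDN or of any browser: the function names are hypothetical, and the suffix-matching rule is a simplified assumption that omits the additional checks real browsers apply.

```python
def domain_matches(cookie_domain: str, request_host: str) -> bool:
    """Simplified suffix matching: a cookie is sent when the request host
    equals the cookie domain or is a subdomain of it (an assumption that
    ignores further browser-side checks)."""
    cookie_domain = cookie_domain.lstrip(".").lower()
    request_host = request_host.lower()
    return request_host == cookie_domain or request_host.endswith("." + cookie_domain)


def scope_by_suffix(host: str, labels: int = 2) -> str:
    """The risky default: keep only the last `labels` labels of the host,
    i.e., treat them as the site's domain.tld without verifying the name."""
    return ".".join(host.lower().split(".")[-labels:])


def scope_to_exact_host(host: str) -> str:
    """Strip no prefixes at all: scope the cookie to the host that set it."""
    return host.lower()


coralized = "example.com.nyud.net"
attacker = "evil.com.nyud.net"

# A script that assumes the last two labels are its own domain ends up
# writing a cookie for all of nyud.net, which the attacker's page can read.
broad = scope_by_suffix(coralized)
leaks = domain_matches(broad, attacker)

# Scoping to the exact Coralized host keeps the cookie away from the attacker
# while still returning it to the site that set it.
narrow = scope_to_exact_host(coralized)
contained = not domain_matches(narrow, attacker)
```

Under these assumptions, the broad scope evaluates to nyud.net and matches the attacker's hostname, while the exact-host scope does not.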
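The origin-side checks suggested in §4.2.3 can be sketched in the same spirit. This hypothetical Python fragment assumes that request headers arrive as a dictionary with lowercase keys, that the subscriber prefix is the documentation range 192.0.2.0/24, and that journal.example.org is the protected site; none of these names come from CoralCDN's deployment. It admits a request only if the Host field names the protected site and both the connecting address and any client address reported in X-Forwarded-For fall inside the allowed prefixes, and it fails closed when a self-identified proxy omits that header.

```python
import ipaddress

# Hypothetical configuration, for illustration only.
ALLOWED_PREFIXES = [ipaddress.ip_network("192.0.2.0/24")]
PROTECTED_HOST = "journal.example.org"


def in_allowed_prefix(ip_text: str) -> bool:
    """True if ip_text parses as an IP address inside an allowed prefix."""
    try:
        ip = ipaddress.ip_address(ip_text.strip())
    except ValueError:
        return False  # unparsable addresses are rejected (fail-safe default)
    return any(ip in net for net in ALLOWED_PREFIXES)


def allow_request(peer_ip: str, headers: dict) -> bool:
    """Decide whether to serve protected content for one request."""
    # Explicitly verify the Host field; a fast-flux alias fails here.
    if headers.get("host", "").lower() != PROTECTED_HOST:
        return False
    # The connecting address must itself be inside an allowed prefix.
    if not in_allowed_prefix(peer_ip):
        return False
    forwarded = headers.get("x-forwarded-for", "")
    via_proxy = "via" in headers or headers.get("user-agent", "").startswith("CoralWebPrx")
    # A proxy that does not report its client is refused.
    if via_proxy and not forwarded:
        return False
    # Every reported client address must also be inside an allowed prefix.
    if forwarded:
        return all(in_allowed_prefix(part) for part in forwarded.split(","))
    return True


direct = allow_request("192.0.2.10", {"host": "journal.example.org"})
relayed_outsider = allow_request(
    "192.0.2.20",
    {"host": "journal.example.org", "user-agent": "CoralWebPrx",
     "via": "1.1 proxy", "x-forwarded-for": "198.51.100.7"},
)
```

The check on the connecting address comes first because headers supplied by an arbitrary peer can be forged; the forwarded address only narrows access further for peers that are already inside an allowed prefix.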