Secure, User-level Resource-constrained Sandboxing

Fangzhe Chang, Ayal Itzkovitz, and Vijay Karamcheti
Department of Computer Science
Courant Institute of Mathematical Sciences
New York University
{fangzhe, ayali, vijayk}@cs.nyu.edu
Report date: 1999

Abstract

The popularity of mobile and networked applications has resulted in an increasing demand for execution "sandboxes": environments that impose irrevocable qualitative and quantitative restrictions on resource usage. Existing approaches either verify application compliance with restrictions at start time (e.g., using certified code or language-based protection) or enforce it at run time (e.g., using kernel support, binary modification, or active interception of the application's interactions with the operating system). However, their general applicability is constrained by the fact that they are either too heavyweight and inflexible, or are limited in the kinds of sandboxing restrictions and applications they can handle.

This paper presents a secure user-level sandboxing approach for enforcing both qualitative and quantitative restrictions on resource usage of applications in distributed systems. Our approach actively monitors an application's interactions with the underlying system, proactively controlling it as desired to enforce the desired behavior. Our approach leverages a core set of user-level mechanisms that are available in most modern operating systems: fine-grained timers, monitoring infrastructure (e.g., the /proc filesystem), debugger processes, priority-based scheduling, and page-based memory protection. We describe implementations of a sandbox that imposes quantitative restrictions on CPU, memory, and network usage on two commodity operating systems: Windows NT and Linux. Our results show that application usage of resources can be restricted to within 3% of desired limits with minimal run-time overhead.

1 Introduction

The increasing availability of network-based services and the growing popularity of mobile computing have resulted in a situation where current-day distributed applications are often built up (possibly dynamically) from components originating from a variety of sources. Since an end-user cannot be expected to trust all of these sources, there is an increasing demand for execution "sandboxes": environments that impose irrevocable qualitative and quantitative restrictions on resource usage. For example, the execution environment can ensure that the application component can only access certain portions of the file system (e.g., /tmp), or that running the component would consume no more than 20% of the CPU share. These restrictions isolate the behavior of other activities on the system from a potentially malicious component, and are a precondition for the wider deployment of distributed component-based applications.

Existing approaches for enforcing qualitative and quantitative restrictions on resource usage can be classified into two broad classes: those that verify application compliance with restrictions at start time and those that enforce it at run time. Examples of the first class include approaches that rely on certified code [16, 17] or language-based protection [1, 3]. These approaches have the limitation of lacking generality (because of reliance on specific programming languages and compilers) and are typically unable to enforce quantitative restrictions. Examples of the second class include approaches that rely on kernel support [13, 15], binary modification [18], or active interception of the application's interactions with the operating system (OS) [2, 8, 9] for isolating resource usage. The kernel approaches are general-purpose but require extensive modifications to OS structure and lack flexibility with respect to what restrictions are imposed. The remainder of the approaches rely on deciding, for each application interaction with the underlying system, whether or not to permit this interaction to proceed; consequently, they provide some flexibility with respect to enforcing qualitative restrictions but are unable to handle most kinds of quantitative restrictions (particularly since usage of some resources, e.g., the CPU, does not require explicit application requests).

This paper presents a secure, user-level sandboxing approach for enforcing both qualitative and quantitative restrictions on resource usage of general-purpose applications in distributed systems. The qualitative aspect is similar to previous systems referred to above; what is novel, however, is the ability of our approach to enforce quantitative constraints. Our approach actively monitors an application's interactions with the underlying system, proactively controlling it as desired to enforce the desired behavior. The security of the approach stems from implementation mechanisms that prevent a malicious application from being able to undo the monitoring and control. Our general strategy recognizes that application access to system resources can be modeled as a sequence of requests (either implicit, such as for a physical memory page, or explicit, such as for a disk operation) spread out over time. This observation provides two alternatives for constraining resource utilization: either control the resources available to the application at the point of the request or (in the case of resources with rate constraints such as CPU, network, and disk) control the time interval between resource requests. In both cases and for all kinds of resources, the specific control is influenced by the extent to which the application has exceeded or fallen behind a progress metric. The latter represents an estimate of the resource consumption of the application program.

Although the high-level strategy is relatively straightforward, the primary challenge lies in accurately estimating the progress metric and effecting necessary control on resource requests with minimal overhead. It might appear that appropriate monitoring and control would require extensive kernel involvement, restricting their applicability. Fortunately, most modern OSes provide a core set of user-level mechanisms that can be used to construct the required support. Support for fine-grained timers and monitoring infrastructures such as the UNIX /proc filesystem and the Windows NT Performance Counters provides the information needed for building accurate progress models.
Similarly, fine-grained control can be effected using standard mechanisms for debugger processes, priority-based scheduling, and page-based memory protection.

We describe implementations of a sandbox that imposes quantitative restrictions on usage of three representative resources (CPU, memory, and network) on two commodity operating systems: Windows NT and Linux. The two implementations utilize the same high-level strategy, but rely on platform-specific monitoring and control mechanisms. A detailed evaluation shows that both sandbox implementations are able to restrict resource usage of unmodified applications to within 3% of the prescribed limits with minimal run-time overhead. We also present a synthetic application that demonstrates the flexibility advantages of user-level sandboxing: in this case, our approach permits application-specific control over resource usage at granularities smaller than those controllable using kernel-level mechanisms.

The specific contributions of this paper include:

• A general user-level strategy for exploiting widely available OS features to impose quantitative restrictions on resource usage, and for securing the sandbox despite its user-level implementation.
• Two concrete implementations of this strategy on the Windows NT and Linux operating systems, using as examples three representative resource types (CPU, memory, and network).
• An evaluation of the overheads and flexibility of the user-level sandboxing approach.

The rest of this paper is organized as follows. Section 2 provides background and discusses related work. Section 3 presents the overall sandboxing strategy and discusses its application for three example resource types: CPU, memory, and network. Section 4 shows how to secure this strategy against malicious applications. Concrete implementations of the sandbox on the Windows NT and Linux OSes are presented and evaluated in Sections 5 and 6. Section 7 highlights the flexibility advantages of user-level sandboxing, and we conclude in Section 8.

2 Background and Related Work

The problem of ensuring that untrusted application components in a distributed system do not violate certain qualitative and quantitative restrictions on resource usage has recently attracted a lot of attention. Existing approaches can be classified into two broad classes: those that verify application compliance with restrictions at start time and those that enforce it at run time.

2.1 Enforcing compliance at start time

Several mechanisms can be used to verify, prior to starting execution of a component, that the latter satisfies the restrictions imposed upon it by the environment. These mechanisms include:

• relying on a certificate authority (e.g., VeriSign [17]) to attest to the fact that the application component satisfies the desired properties,
• using language-based protection techniques such as type safety [3] or static bytecode verification [14] to verify that the program does not misbehave, and
• using techniques such as proof-carrying code [16] that permit a verifier on the client machine to confirm that the application component satisfies certain safety properties.

The above approaches are very effective at ensuring that the application does not violate qualitative restrictions (e.g., that certain types of resources are not accessed from specific code modules), but are typically unable to enforce quantitative restrictions. This is because qualitative restrictions can be expressed as safety properties, which are easier to verify statically. In addition, some of these approaches lack generality because of reliance on specific programming languages and compilers.

2.2 Enforcing compliance at run time

Approaches for enforcing run-time compliance of application behavior fall into two sub-categories:

Kernel-level mechanisms such as CPU reservations [13, 15] and fair-share queueing of CPU [10, 19] and network resources [6, 7, 20] have been employed, primarily in the context of real-time operating systems, to enforce both qualitative and quantitative restrictions. In fact, such support provides a stronger guarantee of a certain level of resource allocation over a time window.
Some restricted versions of such mechanisms are also likely to be available in the form of job control mechanisms in Windows 2000 (announced).

Although such approaches are general-purpose, they require extensive modifications to OS structure that limit their applicability to commodity OSes. Additionally, the policy space of restrictions that can be imposed is inflexible, being limited to the few options that have been designed into the kernel.

Code transformation techniques provide a user-level approach for imposing restrictions on resource usage. These techniques, which include binary modification approaches such as software fault-isolation [18] and API interception approaches such as Janus [9], Mediating Connectors [2], and Naccio [8], all rely on monitoring an application's interactions with the underlying OS. These techniques leverage OS mechanisms such as system-call interception by a debugger process [9], or application structuring mechanisms such as DLL import-address-table rewriting [2, 12], to execute some checking code whenever the application interacts with the OS. This code decides, for relevant interactions, whether to allow the interaction to proceed or to deny it.¹

Consequently, such approaches provide some flexibility with respect to enforcing qualitative restrictions (e.g., that all memory loads/stores are to certain reserved portions of the application address space), but are unable to handle most kinds of quantitative restrictions. The latter is particularly true because usage of some resources, e.g., the CPU, does not involve explicit application requests.

Our approach uses the same underlying mechanisms as the code transformation techniques described above, but differs in its ability to also enforce quantitative restrictions over resource usage. To achieve this, as we describe in the next section, it builds upon core monitoring and control mechanisms that are a feature of most modern OSes.

3 Capacity Sandboxing: Enforcing Quantitative Restrictions

Although our sandboxing approach enforces both qualitative and quantitative restrictions on resource usage, we restrict our attention in the remainder of the paper to quantitative (capacity) constraints. For qualitative restrictions, we rely on the relatively well-understood code transformation approach that has been described earlier. The basic idea is that access control and other qualitative constraints can be enforced by intercepting application interactions with the underlying OS and appropriately modifying them to ensure compliance with the desired security property. For example, an application program can be prevented from accessing files in directories other than a designated one by ensuring that the fopen API call is intercepted and modified to reflect a file name and path relative to this designated directory.

Enforcing quantitative (or capacity) constraints (e.g., that a client program does not use more than 20 MB of RAM) is more involved. This is because several system resources such as the CPU and memory can be accessed without going through a high-level API call that can be intercepted. Moreover, individual API calls may not provide useful information to determine whether or not the client program has exceeded its capacity constraints and what needs to be done to rectify that. In the rest of this section, we first describe the general architecture for our capacity sandbox, and then exemplify its usage.

3.1 General Architecture

To enforce quantitative restrictions on usage for resources of different types, our general strategy relies on the recognition that application access to system resources can be modeled as a sequence of requests (either implicit, such as for a physical memory page, or explicit, such as for a disk operation) spread out over time. This observation provides two alternatives for constraining resource utilization: either control the resources available to the application at the point of the request, or (in the case of resources with rate constraints such as CPU, network, and disk) control the time interval between resource requests.
In both cases and for all kinds of resources, the specific control is influenced by the extent to which the application has exceeded or fallen behind a progress metric. The latter represents an estimate of the consumption of a particular kind of resource over a time slot.

This general strategy is realized by relying upon techniques for instrumenting the application, monitoring its progress, and as necessary, controlling its progress (see Figure 1). Instrumenting allows us to modify the application code so that control over the application is possible. Monitoring enables us to be aware of the current state of the application and its utilization of various resources. Finally, controlling the application progress is the proactive mechanism for enforcing quantitative restrictions on resource usage. All three sets of techniques leverage a core set of user-level mechanisms that are provided by most modern OSes.

¹ Or, in some cases (e.g., Janus [9]), to modify the request into a compliant form prior to allowing it to proceed.

Figure 1: General sandboxing strategy that imposes quantitative restrictions on resource usage. (Diagram: the sandbox surrounds the instrumented application with monitoring and controlling infrastructure, built on operating-system mechanisms: fine-grained timer, /proc filesystem, performance counters, debugger, page-based protection, signal/exception handling, dynamic-library interface, and priority-based process scheduling.)

Instrumenting the application. Instrumentation refers to the technique of modifying the application code on the fly, without having to recompile or relink the application. It leverages the fact that modern OSes provide a significant portion of their functionality as shared libraries whose interfaces are well defined. The application interactions at this interface can be intercepted using OS-specific techniques.
These techniques range from something as simple as library preloading on Linux to more OS-supported mechanisms such as the ptrace facility on several Unix-based OSes, which permits a debugger process to be informed whenever the process being debugged makes a system call. Similar mechanisms exist on the Windows NT OS, ranging from rewrites to the process DLL import address table [2] to more extensive modifications to the process address space [12] using the capabilities of a debugger process. Interception of the application interactions permits the injection of code, which can monitor and control application behavior (in particular, resource utilization).

Our approach relies on instrumentation to enable a distinguished process, the monitor, to load the application components that need to run within a sandbox and to monitor and control the behavior of these processes by injecting required functionality into their address space. The monitor also maps a shared memory segment for coordination with the injected code.

Monitoring progress. Monitoring, as well as accounting, of application behavior is well supported on modern OSes. Extensive information about process usage of various resources (e.g., CPU, physical memory) can be obtained through various monitoring infrastructures. The latter range from special system calls, to consulting the registry of the dynamic performance data (on Windows NT), to a file-system based interface to this data (e.g., the /proc mechanisms on Unix systems). Although our strategy relies on the availability of such mechanisms to obtain progress data, in some cases this data is not very up to date or incurs high overheads for its update. In those situations (e.g., deciding whether or not a process is waiting for a system event), the debugger interface permits faster access to the desired information.

An additional issue is that most of the mechanisms referred to above only provide accumulated information (since the start of the process). So, this information is periodically sampled, with the sampling period automatically adapted to the rate at which the application consumes requests.
For example, usage of network resources can be monitored/updated whenever the application executes a networking API call. For resources that are not accessed through explicit API calls (e.g., CPU and memory), the monitoring information can be periodically updated using support for fine-grained timers. Such timers allow periodic activity to be scheduled at a granularity finer than 10 ms on current-day OSes.

Controlling progress. Upon detecting that the progress metric for an application component has exceeded or fallen behind a resource threshold, our strategy controls application behavior to enforce compliance. The actual mechanism used for controlling progress depends upon whether or not application use of a resource involves an explicit API call. If it does (e.g., for use of network or disk resources), appropriate control can be achieved by limiting the resources available at the point of request or by varying the time interval between resource requests. The latter relies on fine-grained process sleep operations on modern OSes.

When resource usage is implicit, our strategy relies on two sets of mechanisms that: (1) (for CPU) control how frequently the application is scheduled (thereby effecting a delay between resource usage requests) by leveraging support for priority-based process scheduling, and (2) (for memory) return allocated resources back to the OS using a resource-specific protocol. For instance, memory resources can be relinquished by setting page protection bits to NoAccess (on NT), or by unmapping appropriate portions of the virtual address space. The controlling code must also ensure that the application continues to function correctly despite this loss of resources. In the case of memory resources, the controller takes advantage of OS support for page-based protection and user-level protection fault handlers to invoke this functionality.

3.2 Constraining Usage for Different Resource Types

We consider three representative resources (CPU, memory, and network) to illustrate the above strategy. Implementation details on Windows NT and Linux are deferred to Sections 5 and 6.
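
To illustrate the memory-control step outlined in Section 3.1, the following minimal Python sketch simulates a monitor that keeps only a bounded set of pages accessible and handles the protection faults raised by accesses to all other pages. It is a simulation under stated assumptions rather than the sandbox's implementation: the class name UserLevelPager, the page limit, and the first-in-first-out choice of which page to relinquish are hypothetical, and a real monitor would change page protections through the OS instead of updating a dictionary.

```python
from collections import OrderedDict


class UserLevelPager:
    """Simulated user-level pager that caps the number of accessible pages.

    Hypothetical illustration: pages outside `working_set` stand for pages
    whose protection has been set to no-access.
    """

    def __init__(self, max_resident_pages):
        self.max_resident_pages = max_resident_pages
        # Accessible pages, ordered from oldest to newest fault-in.
        self.working_set = OrderedDict()
        self.faults = 0

    def access(self, page):
        """Model one application access to `page`."""
        if page in self.working_set:
            # Accessible page: no fault is raised, so the monitor is not involved.
            return
        # The page is marked no-access, so the protection fault handler runs.
        self.faults += 1
        if len(self.working_set) >= self.max_resident_pages:
            # At the limit: protect the oldest page again (assumed FIFO policy)
            # so that the OS is free to reclaim its physical frame.
            self.working_set.popitem(last=False)
        # Make the faulting page accessible and let the access proceed.
        self.working_set[page] = True
```

Because accesses to accessible pages raise no fault, the sketch cannot observe how recently a resident page was used; this is why it falls back on fault-in order when choosing a page to give up.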

CPU Resources. Here, the quantitative restriction is to ensure that the application receives a stable, predictable processor share. From the application's perspective, it should appear as if it were executing on a virtual processor of a certain speed.

Constraining CPU usage of an application utilizes the general strategy described earlier. The monitor process periodically samples the underlying performance monitoring infrastructure to estimate a progress metric. In this case, progress can be defined as the portion of its CPU requirement that has been satisfied over a period of time. This metric can be calculated as the ratio of the allocated CPU time to the total time this application has been ready for execution in this period. However, although most OSes provide the former information, they do not yield much information on the latter. This is because few OS monitoring infrastructures distinguish (in what gets recorded) between time periods where the process is waiting for a system event and where it is ready, waiting for another process to yield the CPU. To model the virtual processor behavior of an application with wait times (see Figure 2 for a depiction of the desired behavior), we estimate the total time the application is in a wait state using a heuristic. The heuristic periodically checks the process state, either by querying the monitoring infrastructure or by inspecting the process stacks, and assumes that the process has been in the same state for the entire time since the previous check.

The actual CPU share allocated to the application is controlled by periodically determining whether the granted CPU share exceeds or falls behind the desired threshold.
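
A minimal sketch of the progress metric and wait-state heuristic described above, written in Python for a Linux-style /proc interface. The names parse_stat and ProgressEstimator are hypothetical, and the sketch assumes the field layout of /proc/<pid>/stat on current Linux kernels (process state in field 3, user and system CPU time in fields 14 and 15, measured in clock ticks). Treating state R as "ready" and every other state as waiting is itself an approximation chosen for illustration.

```python
def parse_stat(line):
    """Return (state, cpu_ticks) from one /proc/<pid>/stat line."""
    # The command name is parenthesised and may contain spaces or ')',
    # so the remaining fields are taken after the last ')'.
    fields = line[line.rindex(")") + 1:].split()
    state = fields[0]                              # field 3: R, S, D, ...
    cpu_ticks = int(fields[11]) + int(fields[12])  # fields 14 + 15: utime + stime
    return state, cpu_ticks


class ProgressEstimator:
    """Estimate progress = allocated CPU time / time spent ready to run."""

    def __init__(self):
        self.prev_ticks = None  # cumulative CPU ticks at the previous sample
        self.cpu_ticks = 0      # CPU ticks granted since the first sample
        self.ready_ticks = 0    # ticks during which the process was deemed ready

    def sample(self, stat_line, elapsed_ticks):
        """Record one sample taken `elapsed_ticks` after the previous one."""
        state, ticks = parse_stat(stat_line)
        if self.prev_ticks is not None:
            # The counters are cumulative, so only the difference is new usage.
            self.cpu_ticks += ticks - self.prev_ticks
            # Heuristic: the state seen now is assumed to have held for the
            # whole interval since the previous check.
            if state == "R":
                self.ready_ticks += elapsed_ticks
        self.prev_ticks = ticks

    def progress(self):
        """Fraction of the ready time for which the CPU was actually granted."""
        if self.ready_ticks == 0:
            return 0.0
        return self.cpu_ticks / self.ready_ticks
```

On a Linux host the stat line would be read from the file /proc/<pid>/stat at each timer expiry; the monitor would then compare progress() with the requested share to decide whether the application is ahead of or behind its target.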
The guiding principle is that if other applications take up excessive CPU at the expense of the sandboxed application, the monitor compensates by giving the application a higher share of the CPU than what has been requested. However, if the application's CPU usage exceeds the requested processor share, the monitor would reduce the CPU quantum it gets for a while, until the average utilization drops down to the requested level. The scope of these adjustments (i.e., lifetime of the application) needs to be larger than the time period between sampling points where the progress metric is recomputed.

Figure 2: Desired effects on application execution time under a resource-constrained sandbox that limits CPU share when the application contains (a) no wait states, and (b) wait states. (Plots: vertical axes marked 0, 0.5, and 1.0; horizontal axes marked 0, t, and 2t.)

Memory Resources. The quantitative restriction of interest here is the amount of physical memory an application can use. The sandbox would ensure that physical memory allocated to the application does not exceed a prescribed threshold. Monitoring the amount of physical memory allocated to an application is straightforward. The monitoring infrastructure on all modern OSes provides this information in the form of the process resident set size.

Figure 3: A general user-level strategy for controlling application physical memory usage. (Diagram labels: Monitor, Exception handler, faults, Working set, No access, add, remove, access, Application.)

However, it is more involved to control the application behavior in case the OS allocates more physical pages than the threshold.
The problem is that these resources are allocated implicitly, subject to the OS memory management policies. The basic idea is to have the monitor act as a user-level pager on top of the OS-level pager, relying on an OS-specific protocol for voluntarily relinquishing the surplus physical memory pages allocated to the application (see Figure 3). Also, unlike the CPU case where periodic monitoring and control of application progress is required, here the monitoring and control can adapt itself to application behavior. Control is required only if the application's physical memory usage exceeds the prescribed threshold, which in turn can be detected by exploiting OS support for user-level protection fault handlers.

Network Resources. Here, the quantitative restriction refers to the sending or receiving bandwidth available to the application on its network connections. Unlike CPU and memory resources, application usage of network resources involves an explicit API request. This permits the monitoring code injected into the application to keep track of the amount of data sent over a time window and estimate the bandwidth available to the application. Control is equally straightforward: if the application is seen to exceed its bandwidth threshold, it can be made compliant by just stretching out the data transmission or reception over a longer time period (e.g., by using fine-grained sleeps).

Although the above description works well for controlling bandwidth into and out of an end-point,² a different monitoring and controlling approach is required when the sandbox environment must be extended to control network bandwidth for connections between multiple hosts. In this case, simple modeling of the bandwidth results in a situation where bursts in network traffic are not modeled accurately (since these get smoothed out at each end-point). In order to control the network bandwidth B, we need to know when data is sent and its size. The idea is as follows. When the application performs a send(), we inspect the current network usage and decide the emulated amount of bandwidth this send() can get.
In case there are no other "pending" send()s, this value is the full bandwidth limit B, but if there are other preceding send()s that still take up some bandwidth, a lower rate B' might be given so that the overall sending bandwidth is B. Knowledge of the emulated bandwidth B' allows computation of an appropriate delay parameter that models the message send being stretched out by the right amount. After this delay, the entire message is sent out in its original form. With this scheme, the receiver will observe the same bandwidth as the sender, which might be sufficient in cases where there is low contention in the communication pattern between hosts.

Figure 4: A general strategy for imposing global network bandwidth control for distributed applications. (Diagram labels: Send, Sender, Sandbox, Control msg, Data msg, Sandbox, Receiver, Recv, data msg size/bandwidth.)

² For clarity of description, we restrict our attention in this paper to synchronous communication operations and also assume that the data transmission rate in the network is not the bottleneck. The approach needs to be refined slightly to handle situations where communication operations are asynchronous.

In the presence of contention, limiting only the sending bandwidth is not enough. The receiver might accept connections from multiple hosts, thus it is necessary to constrain the incoming message rate according to the receiving bandwidth restrictions, as in a real network. Limiting the receiving bandwidth requires computation of bandwidth from the start of the send operation to the end of the receive operation. However, as stated above, there is no way for the monitoring/controlling code on the receiver to know when a particular message was sent (since there is no way to factor out contention in the network). The ideal solution would be to timestamp each message with a global clock and have the receiving code use this timestamp to compute emulated bandwidth.
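
The sender-side delay computation described earlier in this subsection can be sketched as follows. This Python fragment is illustrative only: the class SendShaper and its method are hypothetical names, the bandwidth limit is called B here, and the sketch assumes one particular way of sharing that limit among pending sends, namely serving them one after another. That choice keeps the aggregate sending rate at B and gives a send that arrives behind others a lower emulated bandwidth.

```python
class SendShaper:
    """Compute the delay that stretches each send to fit a bandwidth limit.

    Hypothetical sketch: pending sends are assumed to be served in order.
    """

    def __init__(self, bandwidth):
        self.bandwidth = bandwidth  # limit B, in bytes per second (must be > 0)
        self.busy_until = 0.0       # time at which all earlier sends are done

    def delay_for(self, size, now):
        """Return (delay, emulated_bandwidth) for `size` > 0 bytes sent at `now`."""
        # With no pending send the message starts immediately; otherwise it
        # waits until the earlier sends have used up their share of time.
        start = max(now, self.busy_until)
        finish = start + size / self.bandwidth
        self.busy_until = finish
        delay = finish - now
        # The bandwidth this send effectively observes (equal to B when idle).
        return delay, size / delay
```

The monitoring code would sleep for the returned delay and then pass the whole message to the real send() unchanged, as the text describes.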
However, given the infeasibility and high overheads of building synchronized clocks in a distributed system, we approximate the behavior of a global clock by having the monitoring code periodically exchange control messages (see Figure 4). Each control message defines the extent of a virtual time window on the originating host, and indicates how much data has been sent to the peer in the previous time window. The receiving node records the local time of reception of a control message and uses it to infer whether or not data messages it has received need to be delayed to emulate the desired bandwidth constraints. Note that this scheme is robust to control messages arriving later than some of the data messages whose information they carry: this would be taken to imply that the receive has in fact been issued in time, and any required delays are just added onto future message receptions.

In Sections 5 and 6, we describe concrete implementations and performance of the strategy described here on two commodity OSes: Windows NT and Linux.

4 Securing the Sandbox

The sandboxing strategy described in the previous section is secure if the monitoring and controlling functionality is embedded in the debugger process that loads the client program being sandboxed. In this case, traditional OS process protection mechanisms and the asymmetric relationship between the debugger and debuggee processes are sufficient to ensure that the client program cannot undo the qualitative and quantitative restrictions imposed upon it.

However, on some operating systems, and more generally for performance reasons (to minimize sandboxing overheads), we might want to inject some of the monitoring and controlling functionality into the client process itself. This leads to a potential vulnerability: since this functionality is part of the user address space, a malicious client program might be able to undo the restrictions. We have developed a strategy for securing the sandbox despite its user-level nature. For space reasons, we only sketch the overall scheme here.
A complete description and evaluation of the scheme is the subject of a forthcoming paper [4].

Our user-level sandboxing strategy guarantees irrevocability by exploiting the observation that the sandboxing code gets a chance to initialize itself before the client program. Therefore, it is possible for the sandboxing code to first modify the code images in the client program's virtual address space as appropriate and then leave it in a state such that neither the client program nor the sandboxing code itself can undo the effects of this modification.

In more detail, there are two main threats that the sandboxing code must counter (see Figure 5):

1. Prevent the client program from undoing any changes to the code images in its virtual address space.
2. Prevent the client program from bypassing the intercepted path for invoking operating system services.

Preventing modification to sandboxing code. To ensure that the first threat does not succeed, the sandboxing code can write the modified code images, protect the pages containing these modified images to prevent further modification, and finally prevent the client program from resetting these protection bits. The latter introduces a bootstrapping problem because the sandboxing program must initially be allowed to reset protection bits. This can be resolved by modifying the API call responsible for changing page protections (mprotect in Unix) in a way such that after desired modifications have been effected, this API call atomically switches into a mode where further illegal page protection changes are disallowed. Our implementation relies on self-modifying code to achieve this.

Preventing bypassing of sandboxing code. To prevent the second threat from succeeding, we need to ensure that the client program (a) does not invoke the same functionality when loaded into a different portion