New Approaches to Website Fingerprinting Defenses

Xiang Cai, Rishab Nithyanand, Rob Johnson
Stony Brook University

arXiv:1401.6022v1 [cs.CR] 23 Jan 2014

Abstract: Website fingerprinting attacks[10] enable an adversary to infer which website a victim is visiting, even if the victim uses an encrypting proxy, such as Tor[19]. Previous work has shown that all proposed defenses against website fingerprinting attacks are ineffective[5], [3]. This paper advances the study of website fingerprinting attacks and defenses in two ways. First, we develop bounds on the trade-off between security and bandwidth overhead that any fingerprinting defense scheme can achieve. This enables us to compare schemes with different security/overhead trade-offs by comparing how close they are to the lower bound. We then refine, implement, and evaluate the Congestion-Sensitive BuFLO scheme outlined by Cai, et al.[3]. CS-BuFLO, which is based on the provably-secure BuFLO defense proposed by Dyer, et al.[5], was not fully specified by Cai, et al., but has nonetheless attracted the attention of the Tor developers[16], [17]. Our experiments find that Congestion-Sensitive BuFLO has high overhead (around 2.3-2.8x) but can get 6× closer to the bandwidth/security trade-off lower bound than Tor or plain SSH.

I. INTRODUCTION

Website fingerprinting attacks have emerged as a serious threat against web browsing privacy mechanisms, such as SSL, Tor, and encrypting tunnels. These privacy mechanisms encrypt the content transferred between the web server and client, but they do not effectively hide the size, timing, and direction of packets. A website fingerprinting attack uses these features to infer the web page being loaded by a client.

Researchers have engaged in a war of escalation in developing website fingerprinting attacks and defenses, with two recent papers demonstrating that all previously-proposed defenses provide little security[5], [3]. At the 2012 Oakland conference, Dyer, et al. showed that an attacker could infer, with a success rate over 80%, which of 128 pages a victim was visiting, even if the victim used network-level countermeasures. They also performed a simulation-based evaluation of a hypothetical defense, which they call BuFLO, and found that it required over 400% bandwidth overhead in order to reduce the success rate of the best attack to 5%, which is still well above the ideal 0.7% success rate from random guessing. At CCS 2012, Cai et al. proposed the DLSVM fingerprinting attack and demonstrated that it could achieve a greater than 75% success rate against numerous defenses[3], including application-level defenses, such as HTTPOS[13] and randomized pipelining[15]. As a result, it is not currently known whether there exists any efficient and secure defense against website fingerprinting attacks.

Cai, et al. also proposed Congestion-Sensitive BuFLO, which extended Dyer's BuFLO scheme to include congestion sensitivity and some rate adaptation, but they left many details unspecified and did not implement or evaluate their scheme. Despite the lack of data on CS-BuFLO, the Tor project has indicated interest in incorporating CS-BuFLO into the Tor browser [17], [16].

In order to get a better understanding of the performance and security of the CS-BuFLO protocol, this paper presents a complete specification of CS-BuFLO, describes an SSH-based implementation, and evaluates its bandwidth overhead, latency overhead, and security against the current best-known attacks.

Cai's description of the CS-BuFLO protocol outlines solutions to several performance and practicality problems in the original BuFLO protocol: CS-BuFLO is TCP-friendly, it pads streams in a uniform way, and it uses information collected offline to tune BuFLO's parameters to the website being loaded. We propose two further improvements: we modify CS-BuFLO to adapt its transmission rate dynamically, and we improve its stream padding to use less bandwidth while hiding more information about the website being loaded. Dynamic rate adaptation makes CS-BuFLO much more practical to deploy, since it does not require an infrastructure for performing offline collection of statistics about websites, but poses a challenge: adapting too quickly to the website's transmission rate can reveal information about which website the victim is visiting. CS-BuFLO balances these performance and security constraints by limiting the rate and precision of adaptation.

We have implemented CS-BuFLO in a custom version of OpenSSH. Our implementation also includes a Firefox browser plugin that informs the SSH client when the browser has finished loading a web page. The CS-BuFLO implementation uses this information to reduce the amount of padding performed after the page load has completed.

We evaluate CS-BuFLO, and compare it to Tor, on the Alexa top 200 websites in the closed-world setting. The Alexa top 200 websites represent approximately 91% of page loads on the internet [1], so these results reflect the security users will obtain when using these schemes in the real world. Furthermore, prior work on website fingerprinting attacks has found that an attacker's success rate only goes down as the number of websites increases, so our results give high-confidence upper bounds on the success rate these attacks may achieve in larger settings.

In our experiments, CS-BuFLO uses 2.8 times as much bandwidth as SSH (i.e. no defense) and the best known attack had only a 20% success rate at inferring which of 200 websites a victim was visiting. This is a substantial improvement over previously-proposed schemes: the same attack had a success rate over 75% against Tor and SSH under the same conditions. Table I compares our results with results reported in other papers.

TABLE I. Main evaluation results for CS-BuFLO, and comparison to results on other schemes reported in other papers. The Panchenko, VNG++, and DLSVM columns give attack success rates in percent.

| Defense | n | Method | Source | Panchenko | VNG++ | DLSVM | BW Ratio | Latency Ratio |
|---|---|---|---|---|---|---|---|---|
| CS-BuFLO (CTSP) | 200 | Empirical | this paper | 18.0 | 13.0 | 20.6 | 2.796 | 3.271 |
| CS-BuFLO (CPSP) | 200 | Empirical | this paper | 24.2 | 16.5 | 34.3 | 2.289 | 2.708 |
| CS-BuFLO (CTSP) | 120 | Empirical | this paper | 23.4 | 20.9 | 28.9 | 2.799 | 3.444 |
| CS-BuFLO (CPSP) | 120 | Empirical | this paper | 30.6 | 22.5 | 40.5 | 2.300 | 2.733 |
| BuFLO (τ=0, ρ=40, d=1000) | 128 | Simulation | [5] | 27.3 | 22.0 | N/A | 1.935 | N/A |
| BuFLO (τ=0, ρ=40, d=1500) | 128 | Simulation | [5] | 23.3 | 18.3 | N/A | 2.200 | N/A |
| BuFLO (τ=0, ρ=20, d=1000) | 128 | Simulation | [5] | 20.9 | 15.6 | N/A | 2.405 | N/A |
| BuFLO (τ=0, ρ=20, d=1500) | 128 | Simulation | [5] | 24.1 | 18.4 | N/A | 3.013 | N/A |
| BuFLO (τ=105, ρ=40, d=1000) | 128 | Simulation | [5] | 14.1 | 12.5 | N/A | 2.292 | N/A |
| BuFLO (τ=105, ρ=40, d=1500) | 128 | Simulation | [5] | 9.4 | 8.2 | N/A | 2.975 | N/A |
| BuFLO (τ=105, ρ=20, d=1000) | 128 | Simulation | [5] | 7.3 | 5.9 | N/A | 4.645 | N/A |
| BuFLO (τ=105, ρ=20, d=1500) | 128 | Simulation | [5] | 5.1 | 4.1 | N/A | 5.188 | N/A |
| HTTPOS | 100 | Empirical | [3] | 57.4 | N/A | 75.8 | 1.361 | N/A |
| Tor + rand. pipe. | 100 | Empirical | [3] | 62.8 | N/A | 87.3 | 1.745 | N/A |
| Tor | 100 | Empirical | [3] | 65.4 | N/A | 83.7 | N/A | N/A |
| Tor | 120 | Empirical | this paper | 56.3 | 36.8 | 77.4 | 1.247 | 4.583 (a) |
| Tor | 200 | Empirical | this paper | 50.1 | 31.8 | 75.1 | 1.244 | 4.919 |
| Tor | 775 | Empirical | [14] | 54.6 | N/A | N/A | N/A | N/A |
| Tor | 800 | Empirical | [3] | 40.1 | N/A | 50.6 | N/A | N/A |
| SSH | 120 | Empirical | this paper | 86.5 | 75.0 | 80.7 | 1.128 | 1 |
| SSH | 200 | Empirical | this paper | 84.4 | 72.9 | 79.4 | 1.111 | 1 |

(a) Note that the high latency of Tor is largely due to its onion routing protocols, a cost that other defenses do not incur.
These comparisons must be done carefully, since the experiments used different numbers of websites and methodologies. Nonetheless, the following conclusions are clear from the data:

• CS-BuFLO hides more information than Tor, SSH, HTTPOS, and Tor with randomized pipelining, albeit with higher cost. For example, the DLSVM attack has a lower success rate against CS-BuFLO in a closed-world experiment with 100 websites than it has against Tor with 800 websites.
• Overall, CS-BuFLO achieves approximately the same bandwidth/security trade-off in our empirical analysis as BuFLO achieved in Dyer's simulated evaluation. For example, CS-BuFLO in CTSP mode had a bandwidth ratio of 2.8 and Panchenko's attack had a success rate of 23.4% on 120 websites. BuFLO with τ=0, ρ=40, and d=1500 had almost identical security, but a bandwidth ratio of 2.2. Although CS-BuFLO optimizes many aspects of the BuFLO protocol, an empirical evaluation presents issues that do not arise in a simulation, such as dropped packets, retransmissions, and application-level timing dependencies.

In addition to the empirical work on CS-BuFLO, this paper provides an analytical study of the problem of defending against website fingerprinting attacks. We show that constructing an optimally-efficient defense scheme for a given set of websites is an NP-hard problem. We then develop lower bounds on the best possible trade-off between security and overhead that any website fingerprinting defense can achieve. Specifically, given a set of websites and a desired security level, we can compute a lower bound on the bandwidth overhead that any defense scheme with that security level can incur on those websites. This enables us to compare defenses that offer different security/bandwidth trade-offs by comparing how close they are to the lower bound.

The paper concludes by using the lower bounds to compare defenses that offer different bandwidth/security trade-offs. We find that Congestion-Sensitive BuFLO gets over 6× closer to the bandwidth/security trade-off lower bound than Tor or plain SSH. Dyer's reported experiments with BuFLO showed somewhat better trade-off performance, but those results were based on simulations and are not directly comparable. Despite the improvement of CS-BuFLO over Tor and SSH, there is still a large gap between the lower bounds and the best defenses.

In summary, this paper makes the following contributions:

• Section IV provides the first analytical results on the website fingerprinting defense problem, showing that constructing an optimal defense is NP-hard and discovering lower bounds on the best possible trade-off between bandwidth and security.
• Section V gives a complete specification of the CS-BuFLO protocol, describing optimizations to make the protocol congestion sensitive, rate adaptive, and efficient at hiding macroscopic website features, such as total size and the size of the last object.
• Section VI describes our prototype implementation in SSH, which also includes a Firefox plugin to notify the proxy when the browser finishes loading a web page.
• Section VII presents empirical evaluation results for CS-BuFLO, Tor, and SSH, and shows that CS-BuFLO provides better security, albeit at higher bandwidth costs. We also show that CS-BuFLO is closer to the lower bound on the security/bandwidth trade-off than Tor and SSH.

II. RELATED WORK

Defenses: Network-level website fingerprinting defenses pad packets, split packets into multiple packets, or insert dummy packets. Dyer, et al., list numerous approaches to padding individual packets, including pad-to-MTU, pad-to-power-of-two, random padding, etc.[5]. They showed that none of the padding schemes was effective against the attacks they evaluated. Wright, et al., proposed traffic morphing, in which packets are padded and/or fragmented so that they conform to a specified target distribution[21]. Dyer, et al., defeated this defense, as well[5]. Lu, et al., extended traffic morphing to operate on n-grams of packet sizes, i.e. their scheme pads and fragments packets so that n-grams of packet sizes match a target distribution[12]. Dyer, et al. also proposed BuFLO, which pads or fragments all packets to a fixed size, sends packets at fixed intervals, injecting dummy packets when necessary, and always transmits for at least a fixed amount of time[5]. They found that they could reduce their best attack's success rate to 5% (when guessing from 128 websites), at a bandwidth overhead of 400%. Fu, et al., found in early work that changes in CPU load can cause slight variations in the time between packets in schemes that attempt to send packets at fixed intervals, and recommended randomized inter-packet intervals instead[6].

Application-level defenses alter the sequence of HTTP requests and responses to further obfuscate the user's activity. For example, HTTPOS uses HTTP pipelining, HTTP Range requests, dummy requests, extraneous HTTP headers, multiple TCP connections, and munges TCP window sizes and maximum segment size (MSS) fields[13]. Tor has also released an experimental version of Firefox that randomizes the order in which embedded objects are requested, and the level of pipelining used by the browser during the requests[15]. Both schemes were defeated by Cai, et al.[3].

Attacks: Researchers have proposed numerous attacks on basic encrypting tunnels, such as HTTPS, link-level encryption, VPNs, and IPSec[2], [4], [8], [9], [10], [11], [12], [18], [22], [23], [5]. These attacks focus primarily on packet sizes, which carry a lot of information when no padding scheme is in use. Herrmann, et al., developed an attack based on packet sizes that worked well on simple encrypting tunnels[9], but performed quite poorly against Tor, which transmits data in 512-byte cells. Panchenko, et al., designed an attack that used packet sizes, along with some ad hoc features designed to capture higher-level information about the HTTP protocol, and achieved good success against Tor[14]. Dyer, et al. performed a comprehensive evaluation of attacks and defenses, and developed their own attack, called VNG++, that achieved good success against many network-level defenses[5]. Cai, et al., proposed an attack, based on string edit distance, that performs well against a wide variety of defenses, including application-level defenses, such as HTTPOS and Tor's randomized pipelining[3]. Wang, et al. improved this attack's performance against Tor by incorporating information about the structure of the Tor protocol [20]. Danezis, Yu, et al., and Cai, et al., all proposed to use HMMs to extend web page fingerprinting attacks to web site fingerprinting attacks[4], [22], [3].

III. WEBSITE FINGERPRINTING ATTACKS

In a website fingerprinting attack, an adversary is able to monitor the communications between a victim's computer and a private web browsing proxy, as shown in Figure 1. The private browsing proxy may be an SSH proxy, VPN server, Tor, or other privacy service. The traffic between the user and proxy is encrypted, so the attacker can only see the timing, direction, and size of packets exchanged between the user and the proxy. Based on this information, the attacker attempts to infer the website(s) that the user is visiting via the proxy. The attacker can prepare for the attack by collecting information about websites in advance. For example, he can visit websites using the same privacy service as the victim, collecting a set of website "fingerprints", which he later uses to recognize the victim's site.

Fig. 1. Website fingerprinting attack threat model. (Diagram labels: User; Anonymizing Service: Client; Encrypted Channel; Anonymizing Service: Server; Web Page; Attacker, Monitoring Traffic.)

Website fingerprinting attacks are an important class of attacks on private browsing systems. For example, Tor states that it "prevents anyone from learning your location or browsing habits."[19] Successful fingerprinting attacks undermine this security goal. Fingerprinting attacks are also a natural fit for governments that monitor their citizens' web browsing habits. The government may choose not to (or be unable to) block the privacy service, but nonetheless wish to infer citizens' activities when using the service. Since it can monitor international network connections, the government is in a good position to mount website fingerprinting attacks.

Researchers have proposed two scenarios for evaluating website fingerprinting attacks and defenses: closed-world models and open-world models. A closed-world model consists of a finite number, n, of web pages. Typical values of n used in past work range from 100 to 800 [5], [3], [14]. The attacker can collect traces and train his attack on the websites in the world. The victim then selects one website uniformly at random, loads it using some defense mechanism, such as Tor or SSH, and the attacker attempts to guess which website the victim loaded. The key performance metric is the attacker's average success rate.

In an open-world model, there is a population of victims, each of which may visit any website in the real world, and may select the website using a probability distribution of their choice. The attacker does not know any individual victim's distribution over websites, but has aggregate statistics about website popularity. The attacker's goal is to infer which of the victims are visiting a particular "website of interest", i.e. an illegal or censored site. In this case, the primary evaluation criteria are false positives and false negatives.

Perry has critiqued the closed-world model for its artificiality [16]. However, the two models are connected: Cai, et al., showed how to bootstrap a closed-world attack into an open-world attack, such that better closed-world performance yields better open-world performance [3]. Thus, although experiments in the closed world cannot tell us whether an attack or defense will be successful in the real world, we can use closed-world experiments to compare different attacks and defenses.

IV. THEORETICAL FOUNDATIONS

In this section we focus on understanding the relationship between bandwidth overhead and security guarantees. We first introduce definitions of security and overhead for fingerprinting defenses. We observe that the overhead required depends on the set of web sites to be protected: a set of similar websites can be protected with little overhead, while a set of dissimilar websites requires more overhead. We then consider an offline version of the website fingerprinting defense problem, i.e. the defense system knows, in advance, the set of websites that the user may visit and the packet traces that each website may generate. We show that finding a defense system with optimal overhead in this setting is NP-hard. We then develop an efficient dynamic program to compute a lower bound on the bandwidth overhead of any fingerprinting defense scheme in the closed-world setting. We will use this algorithm to compute lower bounds on overhead for the websites used in our evaluation (see Section VII).

A. Definitions

In a website fingerprinting attack, the defender selects a website, $w$, and uses the defense mechanism to load the website, producing a packet trace, $t$, that is observed by the attacker. The attacker then attempts to guess $w$.

Let $W$ be a random variable representing the URL of the website selected by the defender. The probability distribution of $W$ reflects the probability that the defender visits each website. For each website, $w$, let $T^D_w$ and $T_w$ be the random variables representing the packet trace generated by loading $w$ with and without defense system $D$, respectively. Packet traces include the time, direction, and content of each packet. Since cryptographic attacks are out of scope for this paper, we assume any encryption functions used by the defense scheme are information-theoretically secure. The probability distribution of $T^D_w$ captures variations in network conditions, changes in dynamically-generated web pages, randomness in the browser, and randomness in the defense system. We assume the attacker knows the distribution of $W$ and $T^D_w$ for every $w$, so the optimal attacker, $A$, upon observing trace $t$, always outputs

\[ A(t) = \arg\max_{w} \Pr[W = w] \, \Pr[T^D_w = t] \]

If more than one $w$ attains the maximum, then the attacker chooses randomly among them.

Some privacy applications require good worst-case performance, and some only require good average-case performance. This leads to two security definitions for website fingerprinting defenses:

Definition 1. Defense $D$ is non-uniformly $\epsilon$-secure if $\Pr[A(T^D_W) = W] \le \epsilon$. Defense $D$ is uniformly $\epsilon$-secure if $\max_w \Pr[A(T^D_w) = w] \le \epsilon$.

These are information-theoretic security definitions: $A$ is the optimal attacker described above. The first definition says that $A$'s average success rate is at most $\epsilon$, but it does not require that every website be difficult to recognize. The second definition requires all websites to be at least $\epsilon$ difficult to recognize. All previous papers on website fingerprinting attacks and defenses have reported average attack success rates in the closed-world model, i.e. they have reported non-uniform security measurements. We will do the same, although we provide some comparison with non-uniform security bounds in Section VII.

To define the bandwidth overhead of a defense system, let $B(t)$ be the total number of bytes transmitted in trace $t$. We define the bandwidth ratio of defense $D$ as

\[ \mathrm{BWRatio}_D(W) = \frac{E[B(T^D_W)]}{E[B(T_W)]} \]

This definition captures the overall ratio of bandwidth between a user using defense $D$ for an extended period of time and a user visiting the same websites with no defense.

B. Lower Bounds for Bandwidth

In this section we derive an algorithm to compute, given websites $w_1, \ldots, w_n$, a lower bound for the bandwidth that any deterministic $\epsilon$-secure fingerprinting defense can use in a closed-world experiment using $w_1, \ldots, w_n$. In a closed-world experiment, each website occurs with equal probability, i.e. $\Pr[W = w_i] = 1/n$ for all $i$.

To compute a lower bound on bandwidth, we consider an adversary that looks only at the amount of data transferred by the defense, i.e. an attacker $A_S$ that always guesses

\[ A_S(t) = \arg\max_{w} \Pr[B(T^D_w) = B(t)] \]

Any defense that is $\epsilon$-secure against an arbitrary attacker must also be at least $\epsilon$-secure against $A_S$. Thus, if we can derive a lower bound on defenses that are $\epsilon$-secure against $A_S$, that lower bound will apply to any $\epsilon$-secure defense.
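To make the size-only attacker $A_S$ concrete, the following is a minimal Python sketch of its guessing rule. It assumes the attacker has estimated, for each website, an empirical distribution of total transmitted bytes from training loads. The function name `size_only_guess`, the site names, and the byte counts are all hypothetical and chosen only for illustration; ties are broken at random, as for the optimal attacker $A$.

```python
import random
from collections import Counter

def size_only_guess(training_sizes, observed_size, rng=random):
    """Guess the site maximizing the empirical Pr[B(T^D_w) = observed_size].

    training_sizes maps each website to the list of total byte counts
    observed over its training loads (hypothetical data layout).
    Returns (guess, candidates), where candidates are all tied maximizers.
    """
    likelihood = {}
    for site, sizes in training_sizes.items():
        if not sizes:
            likelihood[site] = 0.0
            continue
        likelihood[site] = Counter(sizes)[observed_size] / len(sizes)
    best = max(likelihood.values())
    candidates = sorted(s for s, p in likelihood.items() if p == best)
    # Ties are broken uniformly at random.
    return rng.choice(candidates), candidates

# Without padding, sizes differ and the observed size identifies the site.
undefended = {
    "a.example": [4096, 4096, 8192],
    "b.example": [8192, 8192, 8192],
}
# With both sites padded to the same size, the attacker can only guess.
padded = {
    "a.example": [8192, 8192, 8192],
    "b.example": [8192, 8192, 8192],
}
```

On the `undefended` data an observation of 8192 bytes points to `b.example` alone, while on the `padded` data both sites tie, so the attacker succeeds with probability 1/2.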
In our Array C[0...n(cid:15),0...n] i closedworldexperiments,wefoundthat,forjustoverhalfthe for i=0,...,n(cid:15) do webpagesinourdataset,theirsizehadanormalizedstandard C[i,0]←0 deviation of less than 0.11 across 20 loads, so we do not end for believe this assumption will significantly impact the results of for i=0,...,n do ouranalysis.Second,weassumethedefenseschemeinducesa C[0,i]←∞ deterministicmapping,b =f(s ),fromthewebsite’soriginal end for i i size to the size of the trace observed by the attacker. Finally, for i=1→n do we assume that the defense mechanism does not compress or for j =1→n(cid:15) do truncate the website, i.e. that b ≥s for all i. C[j,i]=min [(i−(cid:96))s +C[j−1,(cid:96)]] i i 1≤(cid:96)≤i−1 i Suppose f is the function induced by such a defense. end for Let F = {f(s ),...,f(s )}. For any given b ∈ F, let end for 1 n n = |f−1(b)|, i.e. the number of websites that cause the return C[n(cid:15),n] b defense mechanism to transmit b bytes. The probability that end function the attacker observes b during a closed world experiment is simply n /n, and the probability that the attacker guesses the b correct website based on observation b is 1/nb. Thus the non- Non-uniformly (cid:15)-secure partitions satisfy a slightly different uniform security of the defense scheme is recurrence. If S ,...,S is is an optimal non-uniformly 1 k k-secure partition, then S ,...,S is an optimal non- (cid:88) nb 1 = |F| unniformly k−1 -secure par1tition. Tkh−er1efore the optimal cost, b∈F n nb n C(cid:48)(k,n), sna−ti|sSfike|s the recurrence n and the uniform security is max 1/n . The bandwidth  b∈F b ns if k =1 requirements of the defense is proportional to C(cid:48)(k,n)= n k−1 (cid:88)bn . n  1≤mj≤inn−1C(cid:48)( j ,j)+(n−j)sn o.w. b b∈F Algorithm 1 shows a dynamic program for computing a lower bound on the bandwidth of any deterministic defense Let S = f−1(b). 
Since the defense does not compress or b that can achieve (cid:15) non-uniform security in a closed-world truncate sites, we must have b≥max s. For the purposes s∈Sb experiment on static websites with sizes s ,...,s . We use ofcomputinglowerboundsonthebandwidth,wemayaswell 1 n this algorithm to compute the lower bounds reported in Sec- assume that b=max s. Thus the function f is equivalent s∈Sb tion VII. to a partition of the set {s ,...,s }. 1 n Theseobservationsimplythattheoptimalf mustbemono- tonic. C. Security Against DLSVM Attackers We now analyze the task of defending against DLSVM- Theorem 1. The optimal f is monotonic. style attackers in the same theoretical setting as above. We Proof: Consider any partition of {s1,...,sn} into sets will show that finding the lowest-cost offline defense against S1,...,Sk. Let mi =maxs∈SiSi. Without loss of generality, aDLSVMattackerisNP-hard,viaareductionfromthebinary assume m1 ≤ m2 ≤ ··· ≤ mk. Now consider the monotonic shortestcommonsuper-sequenceproblem.Thisreductionwill allocation of traces into sets S1∗,...,Sk∗ where |Si∗| = |Si|. also show that the minimum bandwidth required by an offline Let m∗i = maxs∈Si∗s. Observe that m∗i ≤ mi for all i, i.e. defense against a DLSVM attacker is at most twice the the new allocation has lower bandwidth. bandwidth lower bound computed in the previous section. Since the number of sets in the partition and the sizes of Thisresult,alongwiththeexperimentalresultsinSectionVII, those sets are unchanged, this new allocation has the same will show that offline defenses can achieve low cost and high uniform and non-uniform security as the original, but lower security, suggesting a promising avenue for future work. bandwidth. Hence the optimal f must be monotonic. Suppose websites w ,...,w are all static and constructed 1 n We can compute the optimal partition for a given security such that loading each site requires performing a fixed, serial- parameter using a dynamic program. 
If S ,...,S is is an ized sequence of requests and responses, e.g. each web page 1 k optimal uniformly (cid:15)-secure partition, then so is S ,...,S . contains a javascript program that loads objects one at a time 1 k−1 Thus the cost, C((cid:15),n) of the optimal uniformly (cid:15)-secure in a fixed order. Let d [j] = 1 iff the jth byte that must be i partition satisfies the recurrence relation: transmitted to load page w is a transmission in the upstream i (cid:40) direction. ∞ if n<1/(cid:15) Loading website w via a deterministic defense mechanism C((cid:15),n)= min C((cid:15),j)+(n−j)s otherwise. i n produces a fixed trace t . Let z be the binary string defined 1≤j≤n−1/(cid:15) i i 6 by z [j]=1 iff the jth byte of t is an upstream byte. Since, BuFLO effectively hides everything about the website, i i for these websites, the defense mechanisms cannot delete or except possibly its size, but has several shortcomings: re-order bytes, we must have that di is a sub-sequence of zi. • It either completely hides the size of the website or When the victim loads a web site, producing trace t, the completelyrevealsit(±dbytes).Thusitdoesnotprovide attacker can compute the corresponding string, z. In order for the same level of security to all websites. theattackertolearnnothingaboutwhichwebpagethevictim • BuFLO has large overheads for small websites. Thus its loaded, we must have that, for all i, di is a substring of z. overhead is also unevenly distributed. Thus the defense system must compute some string, z, that • BuFLO is not TCP-friendly. In fact, it is the epitome of is simultaneously a super-sequence of d1,...,dn. Minimizing a bad network citizen. the cost of such a defense is thus equivalent to finding the • BuFLO does not adapt when the user is visiting fast or shortest common super-sequence (SCS) of d1,...,dn. This slow websites. It wastes bandwidth when loading slow problem is NP-hard[7]. sites,andcauseslargelatencywhenloadingfastwebsites. 
However, there is a simple 2-approximation for the binary • BuFLOmustbetunedtoeachuser’snetworkconnection. SCS problem. Let (cid:96) be the length of the longest string If the BuFLO bandwidth, 1000d B/s, exceeds the user’s ρ d1,...,dn. Their SCS must be at least (cid:96) long, but is at most connectionspeed,thenBuFLOwillincuradditionaldelay 2(cid:96) long, since every binary string of length at most (cid:96) is a without improving security. sub-sequence of (01)(cid:96). Thus for any set of static websites • Past research by Fu, et al., showed that transmitting at w1,...,wn, there exists a deterministic offline defense that fixed intervals can reveal load information at the sender, achieves(uniformornon-uniform)(cid:15)-securityagainstDLSVM- which an attacker can use to infer partial information style attackers and incurs bandwidth cost that is at most twice about the data being transmitted[6]. the bandwidth lower bound derived in the previous section. Dyer, et al., proposed BuFLO as a straw-man defense system, so it is understandable that they did not bother addressing V. CONGESTION-SENSITIVEBUFLO theseproblems.However,weshowbelowthatseveralofthese problems have common solutions, e.g. we can simultaneously Dyer, et al., described BuFLO, a hypothetical defense improve overhead and TCP-friendliness, simultaneously make scheme that hides all information about a website, except security and overhead more uniform, etc. Thus, as our evalu- possibly its size, and performed a simulation-based evaluation ation will show, CS-BuFLO may be a practical and efficient thatfoundthat,althoughBuFLOisabletooffergoodsecurity, defense for users requiring a high level of security. it incurs a high cost to do so. 
Further, as noted by its authors, BuFLO’s simulation based In this section, we describe Congestion-Sensitive BuFLO results “reflect an ideal implementation that assumes the (CS-BuFLO), an extension to BuFLO that includes numerous feasibility of implementing fixed packet timing intervals. This security and efficiency improvements. CS-BuFLO represents is at the very least difficult and clearly impossible for certain a new approach to the design of fingerprinting defenses. Most valuesofρ.Simulationalsoignoresthecomplexitiesofcross- previously-proposed defenses were designed in response to layer communication in the network stack” [5]. As a result, known attacks, and therefore took a black-listing approach to it remains unclear how well the defense performs in the real informationleaks,i.e.theytriedtohidespecificfeatures,such world. as packet sizes. In designing CS-BuFLO, we take a white- listing approach – we start with a design that hides all traffic B. Overview of Congestion-Sensitive BuFLO features,anditerativelyrefinethedesigntorevealcertaintraf- fic features that enable us to achieve significant performance Algorithm 2 shows the main loop of the CS-BuFLO server. improvements without significantly harming security. The client loop is similar, except for the few differences discussed throughout this section. Similar to BuFLO, CS- BuFLO delivers fixed-size chunks of data at semi-regular A. Review of BuFLO intervals.CS-BuFLOrandomizesthetimingofnetworkwrites The Buffered Fixed-Length Obfuscator (BuFLO) of Dyer, in order to counter the attack of Fu, et al.[6], but it maintains et al., transmits a packet of size d bytes every ρ milliseconds, atargetaverageinter-packettime,ρ∗.CS-BuFLOperiodically and continues doing so for at least τ milliseconds. 
If b < d updates ρ∗ to match its bandwidth to the rate of the sender bytes of application data are available when a packet is to (SectionV-C).Sinceupdatingρ∗ basedonthesender’sratere- be sent, then the packet is padded with d−b extra bytes of vealsinformationaboutthesender,CS-BuFLOperformsthese junk. The protocol assumes that the junk bytes are marked updates infrequently. CS-BuFLO uses TCP to be congestion so that the receiver can discard them. If the website does not friendly, and uses feedback from the TCP stack in order to finish loading within τ milliseconds, then BuFLO continues reducetheamountofjunkdataitneedstosend(SectionV-D). transmitting until the website finishes loading and then stops Also like BuFLO, CS-BuFLO transmits extra junk data after immediately. Dyer, et al., did not specify how BuFLO detects the website has finished loading in order to hide the total size when the website has finished loading. They also did not ofthewebsite.However,CS-BuFLOusesascale-independent specify how BuFLO handles bidirectional communication – padding scheme (Section V-E) and monitors the state of the presumablyindependentBuFLOinstancesarerunateachend- page loading process to avoid some unnecessary overheads point. (Section V-F). 7 packet travelling in the opposite direction that contains real payload burst of packets travelling in the same direction packet that contains only junk packet that contains only real payload packet that contains both real payload and junk 2K bytes 2k+1 bytes data sent data sent T T T T T T T T T T T T T T T T 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 time Fig.2. RateadaptationinCS-BuFLO.ρ∗ isupdatedbasedonthepacketstransmittedtotheotherendbetweenT2 andT15.Timeintervalsbetweentwo consecutive packets are stored in an array Intervals[]. The two packets under consideration both contain some real payload data and they belong to the sameburst.i.e.Intervals=[T3−T2,T5−T3,T9−T8,T12−T11,T14−T12,T15−T14]andρ∗=2(cid:98)log2Median(Intervals[])(cid:99). C. 
Rate Adaptation two. This further hides information about the sender’s true rate, and gives the sender room to increase it’s transmission CS-BuFLO adapts its transmission rate to match the rate rate, e.g. during slow start. of the sender. This reduces wasted bandwidth when proxying slow senders, and it reduces latency when proxying fast senders. However, adapting CS-BuFLO’s transmission rate to D. Congestion-Sensitivity match the sender’s reveals information about the sender, and There’s a trivial way to make BuFLO congestion sensitive therefore may harm security. and TCP friendly: run the protocol over TCP. With this As shown in Figure 2, CS-BuFLO takes several steps to approach, we grab an additional opportunity for increasing limit the information that is leaked through rate adaptation. efficiency: when the network is congested, CS-BuFLO does First, it only adapts after transmitting 2k bytes, for some not need to insert junk data to fill the output buffer. integer k. Thus, during a session in which CS-BuFLO trans- Algorithm 4 shows our method for taking advantage of mitsnbytes,CS-BuFLOwillperformlog nrateadjustments, 2 congestion to reduce the amount of junk data sent by CS- limiting the information leaked from these adjustments. This BuFLO.NotefirstthatCS-SENDalwayswritesexactlydbytes choice also allows CS-BuFLO to adapt more quickly during to the TCP socket. Since the amount of data presented to the beginning of a session, when the sender is likely to be the TCP socket is always the same, this algorithm reveals performing a TCP slow start. During this phase, CS-BuFLO no information about the timing or size of application-data is able to ramp up its transmission rate just as quickly as the packets from the website that have arrived at the CS-BuFLO sender can. proxy. CS-BuFLO further limits information leakage by using a This algorithm takes advantage of congestion to reduce the robust statistic to update ρ∗. 
Between adjustments, it collects amount of junk data it sends. To see why, imagine the TCP estimates of the sender’s instantaneous bandwidth. It then connection to the client stalls for an extended period of time. sets ρ∗ so as to match the sender’s median instantaneous Eventually, the kernel’s TCP send queue for socket s will fill bandwidth. Median is a robust statistic, meaning that the new up, and the call to write will return 0. From then until the ρ∗ value will not be strongly influenced by bandwidth bursts TCP congestion clears up, CS-BuFLO calls to CS-SEND will andlulls,andhenceρ∗ willnotrevealmuchaboutthesender’s not append any further junk data to B. transmission pattern. Note that the estimator only collects measurements during E. Stream Padding uninterrupted bursts from the sender. This ensures that the bandwidth measurements do not include delays caused by CS-BuFLO hides the total size of real data transmitted by dependencies between requests and responses. continuing to transmit extra junk data after the browser and For example, if the estimator sees a packet p from the web server have stopped transmitting. 1 website, then a packet p from the client, and then another Table II shows two related padding schemes we experi- 2 packet p from the website, it may be the case that p is mented with in CS-BuFLO. Both schemes introduce at most 3 3 a response to p . In this case, the time between p and a constant factor of additional cost, but reveal at most a 2 1 p is constrained by the round trip time, not the website’s logarithmic amount of information about the size of the 3 bandwidth. website. The first scheme, which we call payload padding, Finally, CS-BuFLO rounds all ρ∗ values up to a power of continues transmitting until the total amount of transmitted 8 Algorithm 2 The main loop of the Congestion-Sensitive Algorithm 3 Algorithm for estimating new value of ρ∗ based BuFLO server. on past network performance. 
function CSBUFLO-SERVER(s) function RHO-ESTIMATOR(ρ-stats, ρ∗) while true do I ←[ρ-stats −ρ-stats |ρ-stats (cid:54)=⊥∧ρ-stats (cid:54)=⊥] i+1 i i i+1 (m,ρ)= READ-MESSAGE(ρ) if I is empty list then if m is application data from website then return ρ∗ output-buff ← output-buff (cid:107) data else real-bytes ← real-bytes + LENGTH(m) return 2(cid:98)log2MEDIAN(I)(cid:99) last-site-response-time ← CURRENT-TIME end if else if m is application data from client then end function send m to the website ρ-stats ← ρ-stats (cid:107)⊥ Algorithm 4 Algorithm for sending data and using feedback onLoadEvent ← 0, padding-done ← 0 fromTCP.SocketsshouldbeconfiguredwithO_NONBLOCK. else if m is onLoad message then function CS-SEND(s, output-buff) onLoadEvent ← 1 n←LENGTH(output-buff) else if m is padding-done message then j ←0 padding-done ← 1 if n< PACKET-SIZE then else if m is a time-out then j ← PACKET-SIZE−n if output-buff is not empty then output-buff ← output-buff (cid:107) j ρ-stats ← ρ-stats (cid:107) CURRENT-TIME end if end if r ← write(s, output-buff, PACKET-SIZE) (output-buff, j) ← CS-SEND(s, output-buff) if r ≥n then (cid:46) Optional: reclaim unsent junk junk-bytes ← junk-bytes + j output-buff← empty buffer end if j ←r−n if DONE-XMITTING then else reset all variables remove last j bytes from output-buff else (cid:46) ρ∗ : Average time between sends to client remove first r bytes from output-buff if ρ∗ =∞ then j ←0 ρ∗ ← INITIAL-RHO end if else if CROSSED-THRESHOLD(real-bytes, junk- return (output-buff,j) bytes) then end function ρ∗ ← RHO-ESTIMATOR(ρ-stats,ρ∗) ρ-stats ← ∅ end if server must know when the website has finished transmit- ting. Congestion-Sensitive BuFLO uses two mechanisms to if m is a time-out then recognize that the page has finished loading. First, the CS- ρ← random number in [0,2ρ∗] BuFLOclientproxymonitorsforthebrowser’sonLoadevent. end if The CS-BuFLO client notifies the CS-BuFLO server when it end if receives the onLoad event from the browser. 
Once the CS- end while BuFLOserverreceivestheonLoadmessagefromtheclient,it end function considersthewebservertobeidle(seeAlgorithm5)andwill stop transmitting as soon as it adds sufficient stream padding and empties its transmit buffer. As a backup mechanism, data (R+J) is a multiple of 2(cid:100)log2R(cid:101). This padding scheme the CS-BuFLO server considers the website idle if QUIET- will transmit at most 2(cid:100)log2R(cid:101) additional bytes, so it increases TIME seconds pass without receiving new data from the the cost by at most a factor of 2, but it reveals only log R. 2 website.Weuseda QUIET-TIME of2secondsinourprototype The second scheme, which we call total padding, continues implementation. transmitting until R+J is a power of 2. This also increases thecostbyatmostafactorof2andreveals,intheworstcase, log R, but it will in practice hide more information about R F. Early Termination 2 than payload padding. Asdescribedabove,theCS-BuFLOserverislikelytofinish Note that the CS-BuFLO server and the CS-BuFLO client each page load by sending a relatively long tail of pure junk do not have to use the same stream padding scheme. Thus, packets. This tail can be a significant source of overhead there are four possible padding configurations, which we and, somewhat surprisingly, may not provide much additional denote CPSP (client payload, server payload), CPST (client security. payload, server total), CTSP (client total, server payload) and Our initial investigations revealed that the long tail served CTST (client total, server total). two purposes which could also be served through other, more Inordertodeterminewhentostoppadding,theCS-BuFLO efficient means. 
As mentioned above, the long tail helps hide 9 Padding Payload Sent Junk Sent Total Bytes Sent CPSP without Client transmitting Client done Schemes Before Padding Before Padding After Padding early termination Server transmitting Server padding Server done ppaadyldoiandg R J c2(cid:100)log2R(cid:101) CeaPrlSyP t ewrimthination Client tranSsmerivtteirn gtransmidonettinCglient done Server done tpoatdadling R J 2(cid:100)log2(R+J)(cid:101) CeaTrlSyP t ewrmithination Client Strearnvsemr tirttainnsgmitting doneClient doSneerver done TABLEII CTSP without Client transmitting Client done TWODIFFERENTPADDINGSCHEMESFORCS-BUFLO. early termination Server transmitting Server padding Server done Time Fig.3. Theinteractionbetweenclientandserverpaddingschemesandearly Algorithm 5 Definition of the DONE-XMITTING function. termination.Morepaddingattheclientcanhelphidethesizeofthelastobject function DONE-XMITTING sent from the server to the client. Early termination can avoid unnecessary return LENGTH(output-buff) ← 0 paddingattheendofapageload. ∧CHANNEL-IDLE(onLoadEvent,last-site-response-time)∧ (padding-done∨CROSSED-THRESHOLD(real-bytes + junk-bytes)) G. Packet Sizes end function Sending fixed-length packets hides packet size information function CHANNEL-IDLE(onLoadEvent, from the attacker. Although any fixed length should work, last-site-response-time) it is important to choose a packet length that maximizes return onLoadEvent ∨ (last-site-response-time + performance.Sincewemaytransmitpurejunkpacketsduring QUIET-TIME < CURRENT-TIME) thetransmission,largerpacketstendtocausehigherbandwidth end function overhead,andontheotherhand,smallerpacketsmaynotmake fulluseofthelinkbetweentheclientandserver,thusincrease function CROSSED-THRESHOLD(x) the loading time. 
return (cid:98)log (x−PACKET-SIZE)(cid:99)<(cid:98)log x(cid:99) Preliminary investigations revealed that over 95.7% of all 2 2 end function upstream packet transmissions are under 600 bytes, therefore, this was used as the standard packet size in our experiments. the total size of the website. However, the interior padding VI. PROTOTYPEIMPLEMENTATION performed by CS-SEND also obscures the total size of the We modified OpenSSH5.9p1 to implement Algorithm 2. website.OurevaluationinSectionVIIinvestigatesthesecurity However, the optional junk recovery algorithm described in impact of additional stream padding. Algorithm 4 was not implemented. In the specific context of web browsing, the long tail also The SSH client was also modified to accept a new SOCKS hidesthesizeofthelastobjectsentfromthewebservertothe proxy command code, onLoadCmd. This command was used client. The attacker can infer some information about the size to communicate to the server when to stop padding (as ofthisobjectbymeasuringtheamountofdatatheCS-BuFLO described in Section V-E). A Firefox plugin, OnloadNotify, server sends to the CS-BuFLO client after the CS-BuFLO that, upon detecting the page onLoad event, connects to the client stops transmitting to the CS-BuFLO server. However, SSHclient’sSOCKSportandissuestheonLoadCmd,wasalso this information can also be hidden by having the CS-BuFLO developed. client continue to send junk packets to the CS-BuFLO server, In addition, the following OpenSSH message types were i.e.moreaggressivestreampaddingfromtheCS-BuFLOclient used: mayobviatetheneedforaggressivepaddingattheCS-BuFLO 1) The OpenSSH message type SSH_MSG_IGNORE, server. which means all payload in a packet of this type can be Basedontheseideas,weimplementedanearlytermination ignored, was used to insert junk data whenever needed. feature in our CS-BuFLO prototype. The CS-BuFLO client 2) The SSH_MSG_NOTIFY_ONLOAD message was cre- notifies the CS-BuFLO server that it is done padding. 
After ated to be used by the client to communicate reception receiving this message, the CS-BuFLO server will stop trans- of onLoadCmd from the browser, to the server. Upon mittingassoonasthewebserverbecomesidleanditsbuffers receiving this message from the client, the CS-BuFLO are empty. server stops transmitting as soon as it empties its buffer Figure 3 illustrates how the padding scheme used by the and adds sufficient stream padding. client and server can interact, including the impact of early 3) The SSH_MSG_NOTIFY_PADDINGDONE message termination.Additionalclientpaddingcanhidethesizeofthe was created to implement the early termination feature lastHTTPobject,andearlyterminationcanavoidunnecessary of CS-BuFLO. Upon receiving this message from the padding. Our evaluation investigates the security/efficiency client, the CS-BuFLO server stops transmitting as soon trade-offs between different padding regimes at the client and as the web server becomes idle and its buffers are server, and how they interact with early termination. empty. 10 All the above messages were buffered and transmitted just Padding Early Bandwidth Latency VNG++ like other messages in Algorithm 2, i.e. using CS-SEND, Termination Ratio Ratio Accuracy thereforeanattackerisunabledistinguishthesemessagesfrom CTSP Yes 3.59 3.91 29.0% other traffic. CTSP No 3.73 3.51 29.6% VII. EVALUATION CPSP Yes 2.60 2.87 34.2% We investigated several questions during our evaluation: CPSP No 3.42 3.52 36.0% • How do the different stream padding schemes affect TABLEIII performance and security of CS-BuFLO? What is the SECURITYANDPERFORMANCEOFCONGESTION-SENSITIVEBUFLO VARIANTS.VNG++SUCCESSRATEISTHEPROBABILITYTHATTHE effect of adding early termination to the protocol? ATTACKWASABLETOCORRECTLYGUESSWHICHOF50WEBPAGESTHE • How does CS-BuFLO’s security and overhead compare USERWASVISITING. toTor’s,andhowdotheybothcomparetothetheoretical minimums derived in Section IV? 
• Can we use the theoretical lower bounds to enable us to disabled during data collection, except when collecting CS- compare defenses that have different security/overhead BuFLO traffic, where we enabled the OnloadNotify plugin. trade-offs? Three of the computers had 2.8GHz Intel Pentium CPUs and 2GB of RAM, one computer had a 2.4GHz Intel Core 2 Duo A. Experimental Setup CPU with 2GB of RAM. We scripted Firefox using Ruby and For our main experiments, we collected traffic from the captured packets using tshark, the command-line version of Alexa top 200 functioning, non-redirecting web pages using wireshark.FortheSSHexperiments,weusedOpenSSH5.3p1. four different defenses: plain SSH, Tor, CS-BuFLO with the Our Tor clients used the default configuration. SSH tunnels CTSP padding and early termination, and CS-BuFLO with passed between two machines on the same local network. CPSPpaddingandearlytermination.Wealsocollectedseveral Wemeasuredthesecurityofeachdefensebyusingthethree smaller data sets using other configurations of CS-BuFLO, best traffic analysis attacks in the literature: VNG++ [5], the but these are only used in the padding scheme evaluations Panchenko SVM [14], and DLSVM [3]. We ran each of the (Table III). above classifiers against the traces generated by each defense WeconstructedalistoftheAlexatop200functioning,non- using stratified 10-fold cross validation. redirecting, unique pages, as follows. We removed web pages thatfailedtoloadinFirefox(withoutTororanyotherproxy). B. Results We replaced URLs that redirected the browser to another URLwiththeirredirecttarget.Somewebsitesdisplaydifferent Padding Schemes: Table III shows the bandwidth ratio, languagesandcontentsdependingonwherethepageisloaded, latencyratio,andsecurity(estimatedusingtheVNG++attack) e.g. www.google.com and www.google.de. 
We kept only one of four different versions of CS-BuFLO on a data set of 50 URLforthistypeofwebsite,i.e.weonlyhadwww.google.com websites.Notethatearlyterminationdoesnotappeartoaffect inourset.OurdatasetconsistedofAlexa’s200highest-ranked security,althoughitcansignificantlyreduceoverheadinsome pages that met these criteria. configurations. All other experiments in this paper use early We collected 20 traces of each URL, clearing the browser termination. The client padding scheme, on the other hand, cache between each page load. We collected traces from each appears to control a trade-off between security and overhead. web page in a round-robin fashion. As a result, each load of Therefore we report the rest of our results for both CPSP and the same URL occurred about 5 hours apart. CTSP padding. Measuring the precise latency of a fingerprinting defense Security Comparison: Figure 4 shows the level of secu- scheme poses a challenge: we can easily measure the time it rity various defense schemes provide against three different takes to load a page using the defense, but we cannot infer attacks, as the number of web pages the attacker needs the exact time it would have taken to load the page without to distinguish increases. Note that the CS-BuFLO schemes the defense. Therefore, every time we loaded a page using have significantly better security than Tor and SSH. For each a defense, we immediately loaded it again using SSH to get defensescheme,wecomputeitsaveragebandwidthratio,BO, an estimate of the time it would have taken to load the page and plot the lower bound on security that can be achieved without the defense in place. We then compute latency ratios within that ratio, using the algorithm from Section IV. the same way we compute bandwidth ratios, i.e. 
if L(t) is the Bandwidth Cost: Figure 5 plots the bandwidth ratios of total duration of a packet trace, the latency ratio of a defense SSH,Tor,andCS-BuFLOwithCTSPandCPSPpadding.SSH scheme is E(cid:2)L(TD)(cid:3) has almost no overhead, and Tor’s overhead is about 25% on W average. CS-BuFLO with CPSP has an average overhead of E[L(T )] W 129%, CTSP has average overhead 180%. Thus CS-BuFLO’s Wecollectednetworktrafficusingseveraldifferentcomput- improved security does come at a price. ers with slightly different versions of Ubuntu Linux – ranging Theoretical Bounds: Figure 6 evaluates CS-BuFLO, Tor, from 9.10 to 11.10. We used Firefox 3.6.23-3.6.24 and Tor SSH, and BuFLO against the theoretical lower bounds devel- 0.2.1.30 with polipo HTTP Proxy. All Firefox plugins were oped in Section IV.
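As a concrete illustration of the rate estimator of Section V-C (Algorithm 3), the following minimal Python sketch computes a new ρ* from a list of send timestamps. It is not the OpenSSH-based prototype: the names `rho_estimator` and `rho_stats` are our own, `None` stands in for the ⊥ marker that Algorithm 2 appends when client data interrupts a burst, and the guard against a non-positive median is an assumption that the pseudocode does not address.

```python
import math
from statistics import median


def rho_estimator(rho_stats, rho_star):
    """Sketch of RHO-ESTIMATOR: return a new target inter-packet time.

    rho_stats is a list of send timestamps (e.g. in milliseconds) in which
    None plays the role of the burst separator (the "bottom" symbol).
    """
    # Keep only gaps between consecutive timestamps of the same burst.
    intervals = [
        later - earlier
        for earlier, later in zip(rho_stats, rho_stats[1:])
        if earlier is not None and later is not None
    ]
    if not intervals:
        return rho_star  # nothing measured: keep the old value
    m = median(intervals)
    if m <= 0:
        return rho_star  # assumption: ignore degenerate measurements
    # Round the median interval down to a power of two.
    return 2 ** math.floor(math.log2(m))


if __name__ == "__main__":
    # Two bursts: gaps of 20 ms, then 40 ms and 60 ms; the median is 40 ms.
    print(rho_estimator([10, 30, None, 100, 140, 200], 64))  # 32
```

Because only the median is used and the result is rounded to a power of two, a single unusually long or short gap within a burst leaves the returned value unchanged.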
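The two stream padding rules of Section V-E (Table II) and the CROSSED-THRESHOLD test of Algorithm 5 can be worked through numerically. The sketch below is illustrative only: the function names are ours, byte counts are plain integers, the default packet size of 600 bytes is taken from Section V-G, and returning False when x does not exceed the packet size is an assumption, since the pseudocode leaves that case undefined.

```python
def ceil_log2(x):
    """Return ceil(log2(x)) for an integer x >= 1, using exact integer math."""
    return (x - 1).bit_length()


def payload_padding_total(real, junk):
    """Total bytes sent under payload padding.

    Pads R + J up to the next multiple of 2**ceil(log2(R)).
    """
    unit = 1 << ceil_log2(real)
    total = real + junk
    return -(-total // unit) * unit  # smallest multiple of unit >= total


def total_padding_total(real, junk):
    """Total bytes sent under total padding: R + J rounded up to a power of 2."""
    return 1 << ceil_log2(real + junk)


def crossed_threshold(x, packet_size=600):
    """True if the last packet pushed the byte count x past a power of two."""
    if x <= packet_size:
        return False  # assumption: log2 of a non-positive number is undefined
    # For positive integers, floor(log2(n)) == n.bit_length() - 1.
    return (x - packet_size).bit_length() < x.bit_length()


if __name__ == "__main__":
    # R = 3000 bytes of payload, J = 6000 bytes of interior junk.
    print(payload_padding_total(3000, 6000))  # 12288 = 3 * 4096
    print(total_padding_total(3000, 6000))    # 16384
```

In this example payload padding stops earlier than total padding, which matches the observation that total padding tends to cost more but hides more about R.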
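The junk-reclaiming behaviour of CS-SEND (Section V-D, Algorithm 4) is easiest to see on small inputs. The following Python sketch mirrors the pseudocode under stated assumptions: `write` is a caller-supplied function that returns the number of bytes accepted (0 when the send queue is full, as the text describes), junk is modelled as zero bytes, and `PACKET_SIZE` is set to the 600 bytes of Section V-G. A real non-blocking socket may signal a full queue differently, so this is a model of the algorithm, not of the prototype's socket handling.

```python
PACKET_SIZE = 600  # bytes, as in Section V-G


def cs_send(write, output_buff):
    """Sketch of CS-SEND: returns (remaining buffer, junk bytes actually sent).

    write(data) must return how many leading bytes of data were accepted.
    """
    n = len(output_buff)
    j = 0
    if n < PACKET_SIZE:
        # Not enough real data: pad the buffer up to one full packet.
        j = PACKET_SIZE - n
        output_buff = output_buff + b"\x00" * j
    r = write(output_buff[:PACKET_SIZE])
    if r >= n:
        # All real data went out; r - n junk bytes went with it and any
        # unsent junk is reclaimed by emptying the buffer.
        return b"", r - n
    # Some real data is still queued: drop the junk we appended, then drop
    # the r real bytes that were accepted. No junk was sent.
    output_buff = output_buff[: len(output_buff) - j][r:]
    return output_buff, 0


if __name__ == "__main__":
    def stalled(data):
        return 0  # models a full TCP send queue

    remaining, junk_sent = cs_send(stalled, b"a" * 100)
    print(len(remaining), junk_sent)  # 100 0
```

When the connection is stalled the buffer keeps only real data, so junk is generated solely for packets that the network can actually carry.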
