



Summoning Demons: The Pursuit of Exploitable Bugs in Machine Learning

Rock Stevens, Octavian Suciu, Andrew Ruef, Sanghyun Hong, Michael Hicks, Tudor Dumitraș
University of Maryland, College Park

arXiv:1701.04739v1 [cs.CR] 17 Jan 2017

ABSTRACT
Governments and businesses increasingly rely on data analytics and machine learning (ML) for improving their competitive edge in areas such as consumer satisfaction, threat intelligence, decision making, and product efficiency. However, by cleverly corrupting a subset of data used as input to a target's ML algorithms, an adversary can perturb outcomes and compromise the effectiveness of ML technology. While prior work in the field of adversarial machine learning has studied the impact of input manipulation on correct ML algorithms, we consider the exploitation of bugs in ML implementations. In this paper, we characterize the attack surface of ML programs, and we show that malicious inputs exploiting implementation bugs enable strictly more powerful attacks than the classic adversarial machine learning techniques. We propose a semi-automated technique, called steered fuzzing, for exploring this attack surface and for discovering exploitable bugs in machine learning programs, in order to demonstrate the magnitude of this threat. As a result of our work, we responsibly disclosed five vulnerabilities, established three new CVE-IDs, and illuminated a common insecure practice across many machine learning systems. Finally, we outline several research directions for further understanding and mitigating this threat.

Keywords
Machine learning, vulnerability research, application security, vulnerability exploitation, fuzzing

1. INTRODUCTION
Governments and businesses increasingly employ data analytics to improve their competitive edge. For example, the United States Environmental Protection Agency has outlined its vision for leveraging machine learning (ML) to improve its everyday operations [12]. IBM offers businesses a platform for conducting sentiment analysis to gauge their effectiveness within a target audience [15]. OpenDNS uses ML to automate protection against known and emerging threats [25]. Machine learning allows these organizations to extrapolate trends from massive datasets, often of uncertain provenance.

However, ingesting unfiltered, public information into data analytic engines also introduces a threat, as miscreants can corrupt eventual inputs to ML algorithms to bias their outputs. Cretu et al. [7] discussed the importance of "casting out demons," or sanitizing training datasets for safe machine learning ingestion. Research on adversarial machine learning [16, 19, 1, 3, 13] has explored various attacks against ML algorithms, with a focus on skewing their outputs through malicious perturbations to the input data.

In this paper, we discuss another attack vector: ML algorithm implementations. Like all software, ML algorithm implementations have bugs, and some of these bugs could affect learning tasks. Thus, attackers can construct malicious inputs to ML algorithm implementations that exploit these bugs. Indeed, such attacks can be more powerful than traditional adversarial machine learning techniques. For example, a memory corruption vulnerability could allow an adversary to corrupt the entire feature matrix, not just the entries that correspond to adversary-controlled inputs. More generally, bugs in the cost function, minimization algorithm, model representation, prediction, or clustering steps could allow an adversary to arbitrarily skew learning outcomes or to initiate a denial-of-service attack.

While considerable effort has been devoted to discovering software vulnerabilities and mitigating the impact of exploits, that work generally focuses on bugs that allow the adversary to subvert the targeted system, e.g., by executing arbitrary code or by achieving privilege escalation.
In contrast, adversaries attacking an ML system are interested in bugs that allow them to induce mispredictions or misclustering, or to suppress outputs. Such logic bugs are difficult to discover using existing tools.

As a first step toward understanding and mitigating this threat, we characterize the attack surface of ML programs, which derives from a general architecture that many ML algorithms share, and we identify decision points whose outcome we may corrupt. We discuss how bugs around those decision points could be exploited and the potential outcomes of these exploits. We also propose a semi-automated technique called steered fuzzing for finding and exploiting ML implementation bugs. We wrap important decision points from the ML architecture with instrumented code to convert a logical failure of the algorithm (e.g., a misprediction) into a crash that can be detected by a fuzzing tool [20], which generates test cases and records program exceptions on these inputs. We then apply a coverage-based fuzzing tool, American Fuzzy Lop [36], to summon demons, i.e., to automatically discover inputs that mislead the ML algorithms by exploiting bugs in their implementation.

We utilize this technique to discover attacks against OpenCV [4] and Malheur [31], two open-source ML implementations. As an example, we started fuzzing with a seed image that was recognized as containing a face by OpenCV. We added a logic branch that crashed on non-recognition. Steered fuzzing then proceeded to generate a mutant input that was clearly still a face, and yet was not recognized. This exploit relies on a bug in the rendering library used by OpenCV, which allows for incorrect rendering of input images. In total, we found seven bugs: three in OpenCV, two in Malheur, one in Scikit-Learn, and one in libarchive (used by Malheur). Of these, three were assigned a CVE-ID; only one was not exploitable.

In summary, this paper makes three contributions. First, we explore the attack surface of ML implementations, as compared to ML algorithms, highlighting potential attack vectors and their impact on various components within these systems.
Second, we introduce a novel technique for exploiting ML bugs to corrupt classification outcomes and the data provided to ML systems from benign sources. This technique is made possible by steered fuzzing, which expands upon existing fuzzing techniques for discovering bugs in software applications. Finally, we discover several new ML implementation bugs in important open-source software; our work has led to these bugs being patched.

2. PROBLEM STATEMENT
We consider an exploit to be a piece of code aiming to subvert the intended functionality of software. Limiting our scope to machine learning, an exploit would be designed with the goal of corrupting the outputs of programs or inhibiting their operation. Such exploitable bugs may be present either in the core implementation of the ML algorithm or in libraries used for feature extraction or model representation.

In terms of impact, we distinguish between three possible outcomes of successful exploits. First, an exploit that causes specific instances to be assigned an incorrect label achieves mispredictions. Specifically, an exploit targeting the training phase results in poisoning of the classifier, while one during testing could allow evasion. Similarly, an exploit targeting a clustering algorithm may cause inputs to be placed in different clusters, resulting in misclustering. Because machine learning systems are often utilized as black boxes, it may be difficult to detect that the system has been compromised by one of these exploits, as they typically have no side effects besides skewing the learned model and causing the ML system to fail silently. An exploit may also result in denial of service, e.g., by stopping data ingestion prematurely or by crashing the application to prevent it from providing any output. While easier to detect, such exploits may render the system temporarily unusable. Finally, a successful exploit could enable code execution capabilities for the attacker. This category of exploits is arguably the most powerful, since it gives the adversary full control over the underlying machine.

In this paper, we address the problem of discovering exploitable vulnerabilities in machine learning implementations. The goals of our work are: (i) to provide a general ML architectural description, discussing possible attack vectors and their impact on different system components; (ii) to develop a semi-automated technique for discovering ML vulnerabilities by exploring this attack surface; and (iii) to demonstrate the magnitude of this threat by discussing several real vulnerabilities we discovered in popular ML systems.

Non-goals: We do not address limitations of machine learning algorithms (the area of study in adversarial machine learning). Instead, we aim to unearth implementation bugs as an orthogonal attack vector against ML systems. Additionally, we do not aim to develop a fully automated technique for identifying these bugs. Instead, by relying on steered code instrumentation and program output manipulation, we are able to bootstrap existing fuzzing tools in order to discover bugs.
2.1 Threat Model
We consider an adversary who aims to subvert the execution of machine learning algorithms by exploiting bugs in algorithm implementations. We assume that the adversary has access to the program's source code. We also assume that the adversary controls some of the program's inputs, but is unable to prevent benign users from providing additional inputs. These assumptions are realistic in many settings; for example, the machine learning techniques proposed for malware classification or clustering [33, 28, 10, 31] operate on inputs that come from many sources, including possible adversaries. Additionally, much ML software is open source, e.g., OpenCV [4] and Scikit-Learn [27].

In searching for exploitable bugs, the adversary does not pursue the usual goals of vulnerability exploitation, e.g., gaining control over remote hosts, achieving privilege escalation, escaping sandboxes, etc. Instead, the adversary's goal is to corrupt the outputs of machine learning programs using silent failures.

From a spectrum-of-control perspective, arbitrary code execution exploits represent the strongest means for achieving the adversary's goal, as such an exploit permits an adversary to manipulate all aspects of the target system. However, the adversary may achieve her goals with less powerful logical exploits, e.g., targeting memory corruption bugs that allow modifying data in memory but do not enable code execution, or bugs that trigger loss of precision in floating-point computations. Denial-of-service attacks could also prove beneficial for the attacker and may, for example, be conducted by inducing early termination of the ML processing. In some settings, the weaker attacks may be more attractive, as they could allow the adversary to remain stealthy and bypass defense mechanisms.

3. ATTACKING ML IMPLEMENTATIONS
To begin exploring the vulnerabilities of machine learning applications, we must first understand their attack surface. Enumerating the components of ML applications that an attacker may target allows us to reason about where the bugs may be and what impact they may have. We then build on this understanding and expand upon existing fuzzing techniques to create exploits for these bugs to induce misclassifications (false negatives or false positives), incorrect clustering results, and denial of service.

[Figure 1: General architecture of ML systems.]

3.1 Machine Learning Architecture
Machine learning algorithms vary in structure and design. The two main categories of learning algorithms are supervised and unsupervised. In supervised learning, the algorithm receives a set of labeled examples and computes a predictive model that fits the training examples. The predictive model is then used to classify new, unlabeled samples. In contrast, unsupervised learning relies solely on unlabeled examples with the goal of finding clusters of similar samples. While there may not be a generic representation that fits all algorithms, some of the most popular supervised techniques are variations of iterative minimization algorithms. For unsupervised learning, clustering is one of the most prevalent classes of algorithms.

Figure 1 presents the general flow of a learning algorithm, highlighting the key particularities of each phase. In this setting, the input samples are transformed into a feature matrix representation that serves as the input to the classifier. A common, but optional, practice is to normalize the features prior to feeding them to the algorithm; this involves feature scaling and standardization. In the training phase, the (normalized) features are applied to the current model in order to obtain the perceived prediction. The predictions are compared to the actual class labels using a cost function. The cost function output quantifies the distance between the current model and the ground truth. The model is then updated to reduce the cost through a minimization algorithm. This iterative process is repeated until the model becomes a sufficiently accurate representation of the ground truth. Upon convergence, the model is used to predict new class labels. In the testing phase, the unlabeled samples are transformed using the same feature extraction and normalization processes. The predicted class labels are obtained using the prediction function over the trained model. In clustering, the algorithm first performs feature extraction and normalization. Using a distance metric, the algorithm then groups the samples into clusters that reflect the similarity between them.
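To make these decision points concrete, the following sketch (our own illustration, not code from any of the libraries studied; it assumes a gradient-descent-style minimizer fitting a toy one-feature linear model) marks where each component of Figure 1 appears in an implementation. Every commented step is a location where the attacks discussed in Section 3.2 could take hold.

#include <math.h>
#include <stdio.h>

/* Illustrative stand-in for the pipeline in Figure 1: a one-feature linear
 * model fit by gradient descent. The structure, not the model, is the point:
 * each commented step is a decision point discussed in Section 3.2. */
int main(void) {
    double raw[4] = {1.0, 2.0, 3.0, 4.0};     /* input samples                */
    double y[4]   = {2.0, 4.0, 6.0, 8.0};     /* class labels / targets       */
    double x[4];

    for (int i = 0; i < 4; i++)               /* feature extraction + scaling */
        x[i] = raw[i] / 4.0;

    double w = 0.0;                           /* model representation         */
    double prev = 0.0, cost = 0.0;

    do {                                      /* minimization algorithm       */
        prev = cost;
        cost = 0.0;
        double grad = 0.0;
        for (int i = 0; i < 4; i++) {
            double err = w * x[i] - y[i];     /* prediction                   */
            cost += err * err;                /* cost function                */
            grad += 2.0 * err * x[i];
        }
        w -= 0.1 * grad;                      /* model update                 */
    } while (fabs(prev - cost) > 1e-9);       /* termination condition        */

    printf("learned weight: %f\n", w);        /* used for later predictions   */
    return 0;
}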
3.2 From Architecture to Attack Surface
We now discuss how attacks on each component in Figure 1 may impact the overall functioning of the system. Table 1 summarizes the vectors and impact of attacks against the system components. A successful attack against one component may have ripple effects on others, either directly by transferring corrupted outputs to inputs, or indirectly via in-memory data structure corruption.

Table 1: Attack surface of ML algorithms.

Component              | Exploitation Techniques                      | Impact
Feature Extraction     | Insufficient integrity checks                | Poisoning / Evasion / Misclustering, Code execution, DoS
Prediction             | Overflow / Underflow, NaN, Loss of Precision | Poisoning / Evasion
Cost Function          | Overflow / Underflow, NaN, Loss of Precision | Poisoning, DoS
Minimization Algorithm | Overflow / Underflow, NaN, Loss of Precision | Poisoning, DoS
Model Representation   | Loss of Precision                            | Poisoning / Evasion
Clustering             | Overflow / Underflow, NaN, Loss of Precision | Misclustering

Feature extraction. Feature extraction is the backbone for the integrity of the system. Every attack from an external source must exploit vulnerabilities in this component, as it is the sole communication port between the internal components and the external environment. An attack targeting the feature extraction component results in a corruption of the information passed downstream.

Within the feature extraction component itself, an attack can target the input parsing algorithms and/or the integrity checks performed on the feature representation. As shown in Section 4, such an exploit could result in mispredictions, misclustering, arbitrary code execution, or DoS. It is not always straightforward to define what is allowable input, or an allowable representation thereof. For example, in an image classification setting, a reasonable assumption would be to consider any renderable image as legitimate input. However, as detailed in Section 4.1, we found that most images that cause crashes in the OpenCV library are actually valid from a rendering perspective.

Prediction. Attacks are also possible against the prediction component, directly influencing the labels predicted by the algorithm. This could occur in both the training and the testing stages. For example, the attack could exploit bugs related to floating-point overflow, floating-point underflow, or the use of not-a-number (NaN) values. ML implementations typically compute logarithms and square roots. This makes them particularly susceptible to bugs caused by NaNs (potentially the result of an overflow or insufficient consistency checks). The effects of a NaN propagate throughout the remainder of the computation and can result in poisoning or classifier evasion.

Cost function and minimization algorithm. The cost function computation and the minimization algorithm are iteratively applied in the training phase. A bug could result in incorrect cost estimates or model updates that cause the decision boundary to be shifted away from the optimal value. Additionally, a denial of service could be obtained if the model update does not trigger the termination condition in the iterative algorithm. If the cost function consistently results in a NaN for the training examples, the minimization algorithm stagnates indefinitely without updating the model.
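The NaN failure mode is easy to reproduce. The fragment below (illustrative, not taken from any of the libraries we studied) shows why a single non-positive value reaching a logarithm can keep an iterative training loop from ever satisfying its convergence test: arithmetic involving a NaN yields NaN, and every ordered comparison against NaN evaluates to false.

#include <math.h>
#include <stdio.h>

int main(void) {
    double feature = -3.0;        /* corrupted value that slipped past validation */
    double cost = log(feature);   /* log of a negative number yields NaN          */
    double prev_cost = 1.0;

    /* The usual convergence test can never fire: comparisons with NaN are false. */
    int converged = fabs(cost - prev_cost) < 1e-6;

    printf("cost = %f, converged = %d\n", cost, converged);  /* NaN cost, converged = 0 */
    return 0;
}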
Model representation. The model representation could cause poisoning or evasion through loss of precision. Since the training and the testing phases of algorithms are typically performed separately, the model has to be stored and transferred from the one to the other. As discussed in Section 4, casting between float and long types can skew the model, resulting in inaccurate predictions for the unlabeled samples.

Clustering. In clustering, the algorithm itself or the distance metric can be manipulated using the same attack vectors as for supervised learning. This could result in a denial of service or misclustering. In complete misclustering, the clusters are completely misrepresented, while in selective misclustering the attack might result in a particular sample being placed in a different cluster.

3.3 Discovery Methods
Fuzzing [20] is a popular method for bug discovery. A fuzzing tool tests a program using randomly generated inputs, which are often invalid or unexpected by the implementation, and records program exceptions or failures. In security, fuzzing has been employed to identify crashes that are indicative of memory safety errors in applications. This technique has obvious applications to discovering one class of bug in machine learning systems, namely crashes, but can we use fuzzing to find bugs that silently corrupt the system's outputs? In this section, we use OpenCV as a running example while describing our bug discovery methodology.

Our use case is, in one sense, a natural fit for general-purpose fuzzing because we can have a single program that runs on some input (i.e., an image) to produce some output (i.e., a text classification of that image). However, we have to ensure that we separate and identify both bug types of interest: crashes and silent corruption. To do this we introduce a technique we call steered fuzzing.

We use American Fuzzy Lop (AFL) [36] to instrument and fuzz-test machine learning programs. AFL was designed and is commonly used for finding crashes due to parsing failures, so the AFL loop involves running an application on multiple inputs and creating a report if an input causes a crash. AFL utilizes a genetic algorithm to generate inputs while maximizing code coverage and has heuristics to discriminate between unique crashes and duplicates. We want to capitalize on AFL's ability to maximize code coverage while also finding crashing inputs.

A steered fuzzing workflow begins with a test case with a known outcome; for example, when analyzing OpenCV, we start with an image that contains a human face. The three outputs from the program under test might be: crash, negative prediction (e.g., no face found), or positive prediction (e.g., face found). The default behavior with the initial test case is to find a face. Our fuzzing should mutate the image to change the output of the program under test to a negative prediction while avoiding a crash.

When we are searching for such logical failures, we do not care about inputs that produce crashes when OpenCV attempts to parse the image (although there is a disturbingly large number of these inputs). The first part of our steered fuzzing technique brackets the parsing regions of the program in a handler for the SIGSEGV signal. The handler simply exits the program when a segmentation violation occurs. This prevents the crash and obscures it from AFL, which then believes that the application exited normally.

We then re-enable crashes in the application and check the outcome of important decision points in the ML algorithm. For example, we check the result of the face detection step, which corresponds to the outcome of the prediction phase from Figure 1. If the system failed to find a face, we induce a crash by manually dereferencing an invalid pointer. In this way, AFL recognizes when it has changed the output of the program to "no face found" without any change to AFL itself. Similarly, we can instrument the outputs of each of the components described in Section 3.2 to check for the presence of exploitable ML bugs.
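Put together, the two instrumentation steps yield a harness of roughly the following shape. This is our own simplified sketch of what the description above implies, with hypothetical parse_input() and found_face() stand-ins rather than real OpenCV calls; the hard-coded pointer write mirrors Listing 2 in Section 4.2.

#include <signal.h>
#include <unistd.h>

/* Step 1: while the input parser runs, turn segmentation faults into clean
 * exits so that AFL does not report parser crashes as findings. */
static void mask_parser_crash(int sig) {
    (void)sig;
    _exit(0);                                /* async-signal-safe clean exit */
}

/* Hypothetical stand-ins for the program under test. */
static int parse_input(const char *path) { (void)path; return 1; }
static int found_face(int parsed)        { return parsed > 0; }

int main(int argc, char **argv) {
    if (argc < 2) return 1;

    signal(SIGSEGV, mask_parser_crash);      /* bracket the parsing region   */
    int parsed = parse_input(argv[1]);
    signal(SIGSEGV, SIG_DFL);                /* re-enable genuine crashes    */

    /* Step 2: convert a logical failure (no face found) into a crash that
     * AFL records as a unique finding. */
    if (!found_face(parsed))
        *((volatile int *)0xdeadbea7) = 0xdeadbeef;

    return 0;
}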
4. RESULTS
We search for exploitable ML bugs in the OpenCV [4] computer vision library and in the Malheur [31] library for analyzing and clustering malware behavior. We select these libraries because they are open source and widely adopted.

OpenCV provides its users with a common framework for computer vision applications and can process still imagery, live streaming video, and previously recorded video clips. For example, businesses can use computer vision and machine learning to reinforce physical security systems [32]. In such a scenario, an adversary may wish to thwart physical security by attacking the machine learning application itself.

Malheur is a security tool that performs analysis on malware reports that were recorded within sandboxed environments. Malheur can cluster the reports to determine which samples likely belong to the same malware family; these malware reports can be raw text files or compressed file archives. Malheur relies on libarchive to extract the malware reports from the file archives. An adversary that desires to delay analysis of their malware may target Malheur through crafted file archives and corrupt in-memory data. Data corruption will cause misclustering and allow the adversary to accomplish their goal.

As a result of our research, we responsibly disclosed five vulnerabilities (of which three were assigned CVE-IDs). The libarchive and Malheur system maintainers patched two of the vulnerabilities; as of January 27, 2016, the OpenCV maintainers had acknowledged three vulnerabilities and planned to address the issues in future releases. These vulnerabilities still exist in the current version of OpenCV. Table 2 summarizes the vulnerabilities we found and their impact.

Table 2: Summary of ML Hunter findings.

Vulnerability                                | Application  | CVE-ID        | Exploited | Impact
Heap corruption in feature extraction        | OpenCV       | CVE-2016-1516 | Yes       | Arbitrary code execution via double free
Heap corruption in feature extraction        | OpenCV       | CVE-2016-1517 | Yes       | Denial-of-service attack via corrupt chunks and segfault
Inconsistent rendering in feature extraction | OpenCV       | n/a           | Yes       | Partial rendering of JPG files results in evasion
Heap corruption in feature extraction        | Malheur      | CVE-2016-1541 | Yes       | Arbitrary code execution on all Linux and OS X systems via corrupted archive
Heap corruption in feature extraction        | Malheur      | GitHub patch  | Yes       | Memory corruption via unsafe bounds checking results in misclustering
Loss of precision in feature extraction      | Malheur      | n/a           | No        | Loss of precision results in misclustering
Loss of precision in model representation    | Scikit-Learn | n/a           | Yes       | Loss of precision results in mispredictions

4.1 Discovery Results
OpenCV. We discovered bugs in OpenCV's image processing library, and we identified various conditions under which a valid JPG would cause an algorithm to terminate. Two vulnerabilities (CVE-2016-1516 and CVE-2016-1517) exist in the feature extraction / selection portion of the ML attack surface in Figure 1 and cause memory corruption when freeing a matrix allocated for image processing. In both CVEs, heap corruptions overflow fields in the matrix object and allow illegal access to memory locations when matrix objects are deallocated. Many examples exist in which an adversary can exploit similar vulnerabilities in image processing code and achieve remote code execution on a victim's system [14, 8, 21]. Our third vulnerability exists in OpenCV's custom image rendering library. Its improper handling of file artifacts and partial rendering of particular JPG images prevent consistent image classification.

During the steered fuzzing phase, these vulnerabilities and inconsistencies served as the basis for crafting legitimate input images that evade facial recognition detection. When used as-is, these images induce denial-of-service (DoS) crashes against OpenCV. DoS crashes in such an application require an administrator or operator to manually intervene to bring the system back online. Proof-of-concept SQL injection attacks already exist against video monitoring software that law enforcement organizations use to read license plates and issue fines [24]; one can understand the ramifications of a DoS attack against similar applications, or even against autonomous vehicles using similar computer vision software.

A potentially viable defense against these crash-inducing images starts with filtering input based on a render-check using the Python Image Library (PIL) [6] and the code snippet in Listing 1:

Listing 1: Image render-check using PIL

from PIL import Image

def is_image_ok(filename):
    try:
        Image.open(filename).load()
        return True
    except:
        return False

Of the 3197 images we found that induce crashes in OpenCV, PIL only allows 7 images to bypass this filter, resulting in a false negative rate of 0.22%. Yahoo! Flickr's proprietary image rendering solution allows 6 crash-inducing images through. These crash-inducing images are publicly available for viewing at https://www.flickr.com/gp/138669175@N07/L53K8e.

Malheur. We discovered a critical bug within libarchive as used by Malheur. The vulnerability was issued CVE-2016-1541 [9] and was patched in libarchive 3.2.0 on May 1, 2016. This vulnerability affected every version of Linux and OS X, given that libarchive is pre-packaged in these operating systems for handling various file archives. A successful exploit would allow an attacker to achieve arbitrary code execution by exploiting the inconsistent handling of .tar.gz compressed archives. This vulnerability occurs within the feature extraction block within Figure 1, given that the function inherently relies upon libarchive for attaining data. Once an attacker achieves arbitrary code execution, they have unlimited influence over the classification of the ML application. This bug could trigger another bug in Malheur's feature matrix extraction/selection, which was patched on March 6, 2016 [30]. A third vulnerability in Malheur results in loss of precision when casting a double value to float. Section 4.2 explores the impact of corrupting the feature matrix in greater depth.

Scikit-Learn and NumPy. We also discovered a loss of precision vulnerability in the popular Scikit-Learn Python framework and the underlying NumPy library that, when exploited, results in mispredictions. Because AFL is not compatible with Python, we manually inspected the source code during bug discovery.
4.2 Steered Fuzzing Results
OpenCV. An attacker can exploit OpenCV's inconsistent rendering of images to induce silent failures and thwart facial detection within the prediction block of the ML attack surface in Figure 1. To begin, AFL utilizes a seed image with a shoulder-up picture of a person. The source code snippet that performs facial recognition (https://github.com/opencv/opencv/blob/a8e5d1d9fdae183762d4e06a7e25473d10ef1974/samples/cpp/facedetect.cpp#L202) is a prime candidate for injecting the logic branch in Listing 2, which allows us to induce a crash when the picture of the face is no longer detected.

Listing 2: Logic branch injection for facial detection

if (faces.size() == 0) {
    *((int *) 0xdeadbea7) = 0xdeadbeef;
}

After 10.1 million permutations of the seed image, AFL crafted the image in Figure 2. This image is incorrectly rendered by OpenCV, as seen on the left, but is clearly renderable by Google Photos, as seen on the right.

[Figure 2: OpenCV incorrectly rendering a picture.]

In five out of five trials, this method successfully recreated photos that exercise this rendering bug. As these images are correctly formatted JPEG files, they bypass the PIL render-check described in Listing 1. In contrast to existing techniques for crafting adversarial samples that evade detection [34, 3, 2, 19, 1], our attack does not depend on the learned model and succeeds from the first attempt. This represents a new attack vector against machine learning, illustrating how bugs in ML code can provide a substantial advantage to the attacker.
Malheur. Building on Malheur's inability to handle corrupted archive files, discussed previously, the steered fuzzing technique can corrupt Malheur's feature matrix and induce silent failures in prediction results. Thus, this vulnerability impacts all aspects of clustering within the generalized attack surface in Figure 1; an attacker can corrupt the in-memory data for unlabeled samples, tamper with the in-memory feature matrix, and affect the clustering results based on the degree of induced skew. The vulnerable line of code (https://github.com/rieck/malheur/blob/75ffd2498e964aa7d09782bf5a0d31afde36585f/src/fvec.c#L382) uses the variable j, which is dependent on user-provided input. Thus, steered fuzzing can craft a corrupted archive file to traverse the heap and stomp over existing values in the feature matrix, as shown in Listing 3.

Listing 3: Example of directed heap corruption

if (((unsigned long)&t[j-1] > (unsigned long)&fv->val[0]) &&
    ((unsigned long)&t[j-1] <
     (unsigned long)((unsigned long)&fv->val[0] + (unsigned long)fv->mem))) {
    *((int *) 0xdeadbea7) = 0xdeadbeef;
}

As this is a heap corruption vulnerability, we performed our proof-of-concept (PoC) exploit with address space layout randomization turned off. An attacker can couple our PoC exploit with ASLR bypass techniques [11], using another information disclosure exploit to find the desired offset. Additionally, our exploit uses a file archive; the exploitation success varies among operating systems and architectures, as expected.

Expanding upon this example, an adversary can inject additional logic branches to control the degree to which the corrupted file impacts the feature matrix. Given enough time, AFL can generate inputs that increasingly skew the clustering of benign files the adversary did not craft.

This represents a second attack vector that provides new capabilities for adversaries. Unlike prior attacks proposed in the adversarial machine learning literature, this attack introduces the ability to manipulate the in-memory representation of inputs not provided by the adversary. From an adversarial perspective, this attack provides the opportunity to miscluster benign samples, or other malicious samples, to obfuscate the attacker's own malicious sample. An adversary can achieve this by inducing false negatives (more stealthy and desired) or false positives (junk reports). Again, this attack requires only one malicious sample and succeeds from the first attempt, owing to the bug. The libarchive 3.2.0 release and the Malheur GitHub patch rendered this bug unexploitable, as corrupted archives are now rejected on ingest.
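The essence of this attack, corrupting feature matrix entries that belong to inputs the adversary never submitted, can be illustrated with a self-contained toy example (ours, not Malheur's actual data layout): when a contiguous feature matrix is indexed with an unchecked, attacker-derived column index, a write aimed at the attacker's own row silently lands in a benign sample's row.

#include <stdio.h>
#include <stdlib.h>

/* Illustrative only: a feature matrix stored as one contiguous heap block of
 * NUM_SAMPLES rows. With no bounds check on the attacker-derived index j, a
 * write aimed at the attacker's own row corrupts a benign sample instead. */
enum { NUM_SAMPLES = 4, NUM_FEATURES = 8 };

int main(void) {
    double *matrix = calloc((size_t)NUM_SAMPLES * NUM_FEATURES, sizeof *matrix);
    if (!matrix) return 1;

    int attacker_row = 0;              /* the one sample the adversary submitted          */
    long j = 2 * NUM_FEATURES + 5;     /* crafted index; should satisfy 0 <= j < 8        */

    matrix[attacker_row * NUM_FEATURES + j] = 1e9;   /* lands in benign sample 2's row    */

    printf("benign sample 2, feature 5: %g\n", matrix[2 * NUM_FEATURES + 5]);
    free(matrix);
    return 0;
}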
Again,this cular, the bugs that led to misprediction, misclustering, or attackrequiresonlyonemalicioussampleandsucceedsfrom modeldivergence—includingaMalheurmemorycorruption thefirstattempt,owingtothebug. Thelibarchive 3.2.0 bugthatallowstheadversarytocontrolthefeaturematrix, andMalheurGitHubpatchrenderedthisbugunexploitable, but not to inject arbitrary code—did not receive CVE nu- as corrupted archives are rejected on ingest. mbers. Many of these bugs were labeled WONTFIX. This emphasizesthefactthatMLbugsarecurrentlyamisunder- We also discovered a bug in Malheur that we could not ex- stood threat. ploit. During feature normalization in Malheur, both func- tions4 fvec_norm1() and fvec_norm2() return a value of 5. RELATEDWORK type double but it is then normalized to a float. Using This section presents prior work on fuzzing and adversarial steered fuzzing, an attacker can discover instances where machinelearning. Adversarialmachinelearningresearchfo- type casting from double to float yields a discrepancy. cuses on crafting adversarial samples. The key distinction inourworkisthatweexploitbugsinmachinelearningcode Listing 4: Example of discovering precision loss that give the adversary an advantage in conducting these attacks. if (abs((double) (f−>val[i] / s) − (float) (f−>val[i] / s)) > epsilon){ Insufficient input sanitization is a common cause of explo- ∗((int ∗) 0xdeadbea7) = 0xdeadbeef; itable bugs [17]. Fuzzing is an automated technique that } allows developers to test how gracefully their application handle various valid and invalid input [20, 23]. Fuzzers as- sistdeveloperswithisolatingpotentiallybuggycodeandcan The epsilon in Listing 4 represents the loss of precision play a critical role in identifying locations in need of input an adversary wishes to induce. In this particular instance, sanitization. steered fuzzing did not discover any value of epsilon that caused misclustering. While our attack was unsuccessful, Thefieldofadversarialmachinelearninghasdevelopedseve- this approach could be a viable attack vector elsewhere. ralmethodsforattackingMLsystems,typicallybyquerying MLmodels. Barrenoetal. [1]proposedageneralclassifica- We discovered that such a vulnerability is present within tionsystemfortheseattacks. Integrityattacksallowhostile the Scikit-Learn machine learning library for Python and input into a system and availability attacks prevent benign its underlying reliance on NumPy. When defining ndar- inputfromenteringasystem. Conceptdrift[35]isapheno- ray objects from Python lists without explicitly specifying menon that occurs within machine learning systems as the adatatype,thelibraryinferstheresultingdatatypeaccor- prediction becomes less accurate over time due to unfore- ding to undocumented heuristics. The NumPy arrays are seen changes. Identifying concept drift, whether sudden or usedby Scikit-learnduringbothtrainingandtestingof the gradual, can be difficult in the presence of noise. Ideally, Linear Regression algorithm. This attack forces NumPy to machinelearningsystemsshouldcombinerobustnesstono- set the ndarray data type as object, which preserves the ise and sensitivity to concept drift. Adversarial drift [19] underlying data type of each element. The Scikit-learn sa- describes intentionally induced concept drift in an effort to nity checks ensure that the training and testing data types decrease the classification accuracy. Biggio et al. [3] descri- match. 
Disclosure experience. Three of the vulnerabilities we discovered were assigned new CVE-IDs (CVE-2016-1516, CVE-2016-1517, and CVE-2016-1541), as they enabled arbitrary code execution or denial-of-service attacks. Disclosing these vulnerabilities allowed us to draw an interesting comparison between the community's reaction to these bugs and the ML-specific bugs we introduce in this paper. In particular, the bugs that led to misprediction, misclustering, or model divergence (including a Malheur memory corruption bug that allows the adversary to control the feature matrix, but not to inject arbitrary code) did not receive CVE numbers. Many of these bugs were labeled WONTFIX. This emphasizes the fact that ML bugs are currently a misunderstood threat.

5. RELATED WORK
This section presents prior work on fuzzing and adversarial machine learning. Adversarial machine learning research focuses on crafting adversarial samples. The key distinction in our work is that we exploit bugs in machine learning code that give the adversary an advantage in conducting these attacks.

Insufficient input sanitization is a common cause of exploitable bugs [17]. Fuzzing is an automated technique that allows developers to test how gracefully their applications handle various valid and invalid inputs [20, 23]. Fuzzers assist developers with isolating potentially buggy code and can play a critical role in identifying locations in need of input sanitization.

The field of adversarial machine learning has developed several methods for attacking ML systems, typically by querying ML models. Barreno et al. [1] proposed a general classification system for these attacks: integrity attacks allow hostile input into a system, and availability attacks prevent benign input from entering a system. Concept drift [35] is a phenomenon that occurs within machine learning systems as the prediction becomes less accurate over time due to unforeseen changes. Identifying concept drift, whether sudden or gradual, can be difficult in the presence of noise. Ideally, machine learning systems should combine robustness to noise with sensitivity to concept drift. Adversarial drift [19] describes intentionally induced concept drift in an effort to decrease classification accuracy. Biggio et al. [3] described a threat model in which an attacker desires to conceal malicious input in an effort to evade detection without negatively impacting the classification of legitimate samples. According to Biggio, an attacker may wish to inject malicious input to subvert the clustering process, rendering the resulting knowledge useless. Adversarial classifier reverse engineering [16] describes techniques for learning sufficient information about a classifier to instrument adversarial attacks. This information provides attackers and defenders with an understanding of how susceptible their system is to external adversarial influence.
Newsome et al. [22] introduce a delusive adversary that provides malicious input in an attempt to obstruct the ML training phase; the attacker assumes full control over the input and its order.

In an analysis of a neural network trained for image processing tasks, Szegedy et al. [34] identified that an adversary can apply a perturbation to an image that is imperceptible to humans yet changes the network's prediction. Goodfellow et al. [13] present a fast method for generating adversarial perturbations to fool an image classifier. Research from Cha et al. [5] further explores automated generation of such perturbations. Utilizing a well-formed seed input, a mutational fuzzer iteratively manipulates the seed to achieve maximum path traversal in a target program. This technique can isolate particular sets of input that cause the program to enter a state that might be of interest for an attacker.

Cretu et al. [7] discuss the process and importance of "casting out demons," sanitizing ML training datasets for anomaly detection (AD) sensors. AD systems inherently receive malicious input and anomalous events that may drastically impact the system's tuning and instrumentation. Accounting for data that may negatively impact the accuracy of the system's classifier can enhance its overall robustness.

6. DISCUSSION AND FUTURE WORK
In this paper, we focus on attacks against machine learning systems. However, our threat model has broader applicability. For example, nation-state adversaries might be interested in attacking long-running simulations on supercomputers, with the aim of subtly skewing their results. In high-performance computing, outputs are generally difficult to validate and expensive to re-compute, so it is difficult to defend against such attacks. Other data analytics systems may also be susceptible to such attacks.

For some of the bugs that we discovered, it is unclear who is responsible for fixing them. Should the Malheur maintainers have to worry about bugs in libarchive in order to preserve the integrity of their application? Should the architects of OpenCV sacrifice performance for the sake of handling invalid input that developers did not filter? As more and more everyday devices begin to incorporate ML processing, this ambiguity must be explicitly resolved in order to provide secure systems.

Section 3 describes a semi-automated approach for discovering bugs in machine learning platforms through categorizing the backtrace of crash-inducing results. Tools such as !exploitable [18] provide researchers with automated crash analysis and the likelihood that the crash is exploitable. Overlaying the findings from such a tool on top of our generalized attack surface could expedite the discovery phase.

Section 4 explores many techniques that, at first glance, are only feasible because the targeted source code is publicly available. Recently, Papernot et al. [26] proposed model extraction attacks that build surrogate classifiers approximating black-box ML models. A logical next step in expanding our research would be understanding the overlap between building substitution models of proprietary classifiers and unique edge cases that result in bugs in the black-box system.

An adversary discovering the possibility of "linchpin values" that appear during feature matrix construction would be another decisive shift towards an attacker's influence on ML systems. Linchpin values are consistent ranges of values within a feature matrix that, when present, result in a specific classification. Ribeiro et al. [29] proposed a technique for model explanation by building locally optimal classifiers around points of interest. Building upon their work, researchers may apply various analytic techniques to determine if there are common values or thresholds within a feature matrix that, when present, always result in a certain classification. With this information, an attacker could use steered fuzzing to craft arbitrary input that would guarantee a misclassification in the targeted system.

7. CONCLUSIONS
Entities that choose to trust data from unvetted sources subject themselves to a plethora of potential attacks in which a miscreant requires only minimal control over the entire dataset. For an attacker that wishes to control the decision-making process of its competitors or adversaries, this represents a powerful paradigm shift in attack vectors. We discovered several vulnerabilities within OpenCV and Malheur that allow an attacker to exploit bugs in underlying dependencies and the applications themselves to gain a marked advantage in influencing or outright controlling the output of ML applications.

8. REFERENCES
[1] M. Barreno, B. Nelson, A. D. Joseph, and J. Tygar. The security of machine learning. Machine Learning, 81(2):121–148, 2010.
[2] B. Biggio, B. Nelson, and P. Laskov. Support vector machines under adversarial label noise. In ACML, pages 97–112, 2011.
[3] B. Biggio, I. Pillai, S. Rota Bulò, D. Ariu, M. Pelillo, and F. Roli. Is data clustering in adversarial settings secure? In Proceedings of the 2013 ACM Workshop on Artificial Intelligence and Security, pages 87–98. ACM, 2013.
[4] G. Bradski. OpenCV. Dr. Dobb's Journal of Software Tools, 2000.
[5] S. K. Cha, M. Woo, and D. Brumley. Program-adaptive mutational fuzzing. In 2015 IEEE Symposium on Security and Privacy, pages 725–741. IEEE, 2015.
[6] A. Clark. Python PILLOW, 2015.
[7] G. F. Cretu, A. Stavrou, M. E. Locasto, S. J. Stolfo, and A. D. Keromytis. Casting out demons: Sanitizing training data for anomaly sensors. In 2008 IEEE Symposium on Security and Privacy, pages 81–95. IEEE, 2008.
[8] CVEdetails. CVE-2015-4493: Heap-based buffer overflow in the stagefright::ESDS::parseESDescriptor function in libstagefright in Mozilla Firefox, 2015.
[9] CVEdetails. Vulnerability note VU#862384, 2016.
[10] G. E. Dahl, J. W. Stokes, L. Deng, and D. Yu. Large-scale malware classification using random projections and neural networks. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2013), Vancouver, BC, Canada, May 26-31, 2013, pages 3422–3426, 2013.
[11] T. Durden. Bypassing PaX ASLR protection. Phrack Magazine, 59(9):9–9, 2002.
[12] Environmental Protection Agency. EPA's cross-agency data analytics and visualization program | Toxics Release Inventory (TRI) program | US EPA, 2015.
[13] I. J. Goodfellow, J. Shlens, and C. Szegedy. Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572, 2014.
[14] Google Project Zero. Hack the Galaxy: Hunting bugs in the Samsung Galaxy S6 Edge, 2015.
[15] IBM. IBM social sentiment analysis powered by IBM Analytics – India, 2015.
[16] D. Lowd and C. Meek. Adversarial learning. In Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery in Data Mining, pages 641–647. ACM, 2005.
[17] G. McGraw. Software Security: Building Security In, volume 1. Addison-Wesley Professional, 2006.
[18] Microsoft. !exploitable crash analyzer - MSEC debugger extensions, 2016.
[19] B. Miller, A. Kantchelian, S. Afroz, R. Bachwani, E. Dauber, L. Huang, M. C. Tschantz, A. D. Joseph, and J. Tygar. Adversarial active learning. In Proceedings of the 2014 Workshop on Artificial Intelligent and Security Workshop, pages 3–14. ACM, 2014.
[20] B. P. Miller, L. Fredriksen, and B. So. An empirical study of the reliability of UNIX utilities. Communications of the ACM, 33(12):32–44, 1990.
[21] National Institute of Standards and Technology. NVD detail, 2015.
[22] J. Newsome, B. Karp, and D. Song. Paragraph: Thwarting signature learning by training maliciously. In Recent Advances in Intrusion Detection, pages 81–105. Springer, 2006.
[23] P. Oehlert. Violating assumptions with fuzzing. IEEE Security & Privacy, 3(2):58–62, 2005.
[24] G. Ollmann. SQL injection in the wild, 2013.
[25] OpenDNS. Cyber threat intelligence | OpenDNS, 2015.
[26] N. Papernot, P. D. McDaniel, I. J. Goodfellow, S. Jha, Z. B. Celik, and A. Swami. Practical black-box attacks against deep learning systems using adversarial examples. CoRR, abs/1602.02697, 2016.
[27] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12:2825–2830, 2011.
[28] R. Perdisci, W. Lee, and N. Feamster. Behavioral clustering of HTTP-based malware and signature generation using malicious network traces. In Proceedings of the 7th USENIX Symposium on Networked Systems Design and Implementation (NSDI 2010), San Jose, CA, USA, April 28-30, 2010, pages 391–404, 2010.
[29] M. T. Ribeiro, S. Singh, and C. Guestrin. "Why should I trust you?": Explaining the predictions of any classifier. arXiv preprint arXiv:1602.04938, 2016.
[30] K. Rieck. Fix for problem with corrupt archives, 2016.
[31] K. Rieck, P. Trinius, C. Willems, and T. Holz. Automatic analysis of malware behavior using machine learning. Journal of Computer Security, 19(3), 2011.
[32] J. Sandhu. Machine learning for smart home security systems, 2016.
[33] M. G. Schultz, E. Eskin, E. Zadok, and S. J. Stolfo. Data mining methods for detection of new malicious executables. In 2001 IEEE Symposium on Security and Privacy, Oakland, California, USA, May 14-16, 2001, pages 38–49, 2001.
[34] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus. Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199, 2013.
[35] A. Tsymbal. The problem of concept drift: definitions and related work. Computer Science Department, Trinity College Dublin, 106, 2004.
[36] M. Zalewski. American Fuzzy Lop, 2015.
