Shrinking and Exploring Adversarial Search Spaces
David Evans, University of Virginia
Weilin Xu, Yanjun Qi
evadeML.org
ARO Workshop on Adversarial Learning
Stanford, 14 Sept 2017

Machine Learning is Eating Computer Science

Security State-of-the-Art

Cryptography
  Random guessing attack success probability: [illegible]
  Threat models: information theoretic, resource bounded
  Proofs: required
System Security
  Random guessing attack success probability: [illegible]
  Threat models: capabilities, motivations, rationality
  Proofs: common
Adversarial Machine Learning
  Random guessing attack success probability: [illegible]*; [illegible]
  Threat models: white-box, black-box
  Proofs: rare!

Adversarial Examples

“panda” + 0.007 × [noise] = “gibbon”

Example from: Ian J. Goodfellow, Jonathon Shlens, Christian Szegedy. Explaining and Harnessing Adversarial Examples. 2014.

Adversarial Examples Game

Given seed sample, x, find x′ where:
  f(x′) ≠ f(x)    Class is different (untargeted)
  f(x′) = t       Class is t (targeted)
  Δ(x, x′) ≤ δ    Difference below threshold

Δ(x, x′) is defined in some (simple!) metric space:
L0 norm (# different), L1 norm, L2 norm (“Euclidean”), L∞ norm

Detecting Adversarial Examples

Input → Model → Prediction_0
Input → Squeezer 1 → Model → Prediction_1
Input → Squeezer 2 → Model → Prediction_2
…
Input → Squeezer k → Model′ → Prediction_k

d(pred_0, pred_1, …, pred_k)
  Yes: Adversarial
  No: Legitimate

“Feature Squeezing”

x  = [0.054, 0.4894, 0.9258, 0.0116, 0.2898, 0.5222, 0.5074, …]
Squeeze: x_i = round(x_i × 4)/4
     [0.0, 0.5, 1.0, 0.0, 0.25, 0.5, 0.5, …]

squeeze(x′) ≈ squeeze(x) ⟹ f(squeeze(x′)) ≈ f(squeeze(x))

     [0.0, 0.5, 1.0, 0.0, 0.25, 0.5, 0.5, …]
Squeeze: x′_i = round(x′_i × 4)/4
x′ = [0.0491, 0.4903, 0.9292, 0.009, 0.2942, 0.5243, 0.5078, …]

Example Squeezers

Reduce Color Depth
  8-bit greyscale
  1-bit monochrome
Median Smoothing
  3x3 smoothing: Replace with median of pixels and its neighbors

Simple Instantiation

Input → Model (7-layer CNN) → Prediction_0
Input → Bit Depth-1 → Model → Prediction_1
Input → Median 2×2 → Model → Prediction_2

max(L1(p_0, p_1), L1(p_0, p_2)) > t
  Yes: Adversarial
  No: Legitimate
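
The two example squeezers can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names `reduce_bit_depth` and `median_smooth_3x3` are hypothetical, inputs are assumed to be scaled to [0, 1], and the edge-replicating padding is an assumption, since the slides do not say how image borders are handled. The sample vectors are the ones shown on the “Feature Squeezing” slide.

```python
import numpy as np


def reduce_bit_depth(x, levels=4):
    """Quantize values in [0, 1] as on the slide: round(x * levels) / levels."""
    x = np.asarray(x, dtype=float)
    # np.round sends exact .5 ties to the nearest even integer; none occur below.
    return np.round(x * levels) / levels


def median_smooth_3x3(img):
    """Replace each pixel of a 2-D image with the median of its 3x3 window.

    Borders are handled by replicating edge pixels (an assumption).
    """
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    padded = np.pad(img, 1, mode="edge")
    # Nine shifted views of the padded image, one per position in the window.
    windows = [padded[i:i + h, j:j + w] for i in range(3) for j in range(3)]
    return np.median(np.stack(windows), axis=0)


# Sample vectors from the slide: a seed input and a slightly perturbed copy.
x = [0.054, 0.4894, 0.9258, 0.0116, 0.2898, 0.5222, 0.5074]
x_prime = [0.0491, 0.4903, 0.9292, 0.009, 0.2942, 0.5243, 0.5078]

squeezed = reduce_bit_depth(x)
squeezed_prime = reduce_bit_depth(x_prime)
print(squeezed.tolist())  # -> [0.0, 0.5, 1.0, 0.0, 0.25, 0.5, 0.5]
print(np.array_equal(squeezed, squeezed_prime))  # -> True

# A single bright pixel is removed by 3x3 median smoothing.
spot = np.zeros((3, 3))
spot[1, 1] = 1.0
smoothed = median_smooth_3x3(spot)
```

Both inputs collapse to the same squeezed vector, which is the property the slide relies on: if squeeze(x′) ≈ squeeze(x), the model's predictions on the two squeezed inputs should also be close.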

[Figure: histogram of legitimate and adversarial examples. x-axis: maximum L1 distance between original and squeezed input (0.0 to 2.0); y-axis: number of examples (0 to 800).]
threshold = 0.0029
detection: 98.2%, FP < 4%
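
The detection rule from the “Simple Instantiation” slide, max(L1(p_0, p_1), L1(p_0, p_2)) > t, can be sketched as follows. This is an illustrative sketch under stated assumptions: the helper names (`l1_distance`, `detection_score`, `is_adversarial`) are hypothetical, the prediction vectors are made-up toy values rather than model outputs, and only the threshold 0.0029 comes from the slides.

```python
import numpy as np

THRESHOLD = 0.0029  # threshold reported on the histogram slide


def l1_distance(p, q):
    """L1 distance between two prediction (probability) vectors."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.sum(np.abs(p - q)))


def detection_score(pred_original, squeezed_preds):
    """Maximum L1 distance between the original prediction and each squeezed one."""
    return max(l1_distance(pred_original, p) for p in squeezed_preds)


def is_adversarial(pred_original, squeezed_preds, threshold=THRESHOLD):
    """Flag the input when squeezing moves the prediction by more than the threshold."""
    return detection_score(pred_original, squeezed_preds) > threshold


# Toy prediction vectors (hypothetical, two classes).
stable = [0.9, 0.1]      # prediction on the unmodified input
flipped = [0.2, 0.8]     # prediction after a squeezer changes the outcome

legit_flag = is_adversarial(stable, [stable, stable])   # squeezing changes nothing
adv_flag = is_adversarial(stable, [stable, flipped])    # one squeezer flips the class
adv_score = detection_score(stable, [stable, flipped])  # about 1.4
```

Taking the maximum over squeezers means a single squeezer that moves the prediction is enough to flag the input, which matches the Yes/No branch drawn on the slide.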