When AI helps Wildlife Conservation: Learning Adversary Behaviors in Green Security Games

by

Debarun Kar

A Dissertation Presented to the
FACULTY OF THE GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(Computer Science)

June 2017

Copyright 2017 Debarun Kar

Acknowledgements

First, I would like to take this opportunity to thank my advisor, Professor Milind Tambe. I want to express my heartfelt gratitude to him for patiently guiding me during the course of my doctoral pursuit in the past four years. This dissertation would not have been possible without his invaluable guidance, prolific encouragement, constructive criticism and persistent help. I realized very early on how lucky I was to have chosen Milind as my advisor, and my respect for him has constantly grown over the years and still continues to do so. Having attended so many conferences, I have seen first hand the respect and admiration he commands and the resultant respect we as his students get wherever we go. Milind, your hard work and dedication never cease to amaze me and I hope to emulate such an attitude and aptitude in both my future professional career and personal life. Furthermore, it gives me immense satisfaction and a sense of relief to know that I have a mentor for life. I am deeply humbled to be your 25th graduating PhD student.

Next, I would like to thank Nicole Sintov for being a wonderful mentor during the early years of my PhD, guiding me along with Milind through my first set of research papers. Your invaluable insights on human behavior, empirical research methods and experimental design have immensely benefited my research over the years. Without our frequent interactions, my research would have suffered dearly, and so I'll always be grateful for your guidance. I would also like to thank all of my dissertation committee members for their time towards my research and development. My sincere gratitude to all of you: Jelena Mirkovic, Morteza Dehghani, Phebe Vayanos, Richard John and Yiling Chen. Having such an interdisciplinary ensemble of committee members helped shape my thesis in ways that I could not have imagined.

All of my projects at USC were possible due to collaborations with many amazing researchers and rangers at the Uganda Wildlife Authority, Wildlife Conservation Society, and World Wildlife Fund. Their relentless efforts to make the world a better and safer place enabled me to work with them in not only developing novel artificial intelligence models but, more importantly, testing them out in the field with exciting results. Specifically, I want to thank Andrew Plumptre from WCS and Arnaud Lyet from WWF for working with us and providing us valuable data and insights throughout the course of this work. This made my PhD all the more fun and valuable, and so I thank them for giving me this opportunity.

During my time at USC I have also had the honor to work with many great researchers, either while collaborating on papers or while organizing workshops at conferences.
In particular, I would like to thank: Fei Fang and Francesco Delle Fave for being amazing collaborators, without your help many of my papers would never have happened; Ben Ford and Shahrzad Gholami for collaborating with me on an exciting wildlife conservation paper; Subhasree Sengupta for helping me complete the final paper in my PhD; Arunesh Sinha for being an amazing friend, for your tremendous mathematical expertise and for allowing me to work with you on the only theory paper that I will probably ever write; Jun-young Kwak for allowing me to work with you on the very first paper that I was a part of during my PhD; William Haskell for being the coolest mathematician I have ever seen, thank you for our joint work on fisheries protection; Ece Kamar and Eric Horvitz for being interested in my work and for all the valuable suggestions for our paper on belief modeling; Liz Bondi, with whom I hope to significantly impact the field of wildlife conservation in the coming years through the use of drone technology; and Eugene Vorobeychik, Haifeng Xu and Long Tran-Thanh for your willingness to work with me in conducting two successful adversarial reasoning workshops at AAMAS. I would also like to thank Thanh Nguyen, Matthew Brown, Yasaman Dehghani Abbasi, Pradeep Varakantham, Aram Galstyan and Bo An for working with me on some exciting problems. A big thank you to all the B.S. and M.S. students who have worked with me over the years and for allowing me to guide them: Brian Schwedock, Nikhil Cherukuri, Amit Plaha, Vivek Tiwari, Jie Zheng, Chiao-meng Huang, Soudhamini Radhakrishnan and Donnabell Dmello.

This brings me to the rest of the TEAMCORE family, which has been built through years of hard work, dedication and mutual respect by Milind and his students, many of whom I have had the privilege of spending my PhD career with, while others whom I have met elsewhere. All of you have been a constant source of friendship and support, and so I thank you all: Chao Zhang, Yundi Qian, Leandro Marcolino, Aaron Schlenker, Sara Mc Carthy, Bryan Wilder, Elizabeth Bondi, Aida Rahmattalabi, Eric Shieh, Rong Yang, Albert Jiang, Manish Jain, James Pita, Chris Kiekintveld, Matthew Taylor, Janusz Marecki, Praveen Paruchuri, Paul Scerri, and Milind's very first PhD student Gal Kaminka. There is one member of the TEAMCORE family whom I want to thank immensely. Amulya Yadav, I am very grateful to have found a friend like you 9000 miles away from home and to have shared my PhD life with you. You are no less than a brother to me. Thank you for the heartfelt discussions during all the good and bad times in my life, and thank you for the constant laughs, your infectious free-spirited attitude and for being a constant source of encouragement. Win more best paper awards, buddy!

I would also like to thank a few other people who played an important role during my stay at USC. Thank you Pradipta Ghosh and Nasir Mohammed for rushing to the hospital so late in the night of my accident and for making sure that I reached home safely after my treatments. Thank you Sahil Garg for being a wonderful roommate and for all your help throughout my good and bad times, especially the tough one month when I was bed-ridden after my accident. Thanks also to all my friends at the USC Bengali organization for making me feel at home.

Since the time I came to the United States, I have shared a major part of my life with one person in particular. Thank you Poushali Banerjee for all your love and support and for lighting up my life every day for these past four years. I want to express my sincerest appreciation to you for being there with me in this long and rocky journey, even during the lengthy periods of time when I retreated to working on my papers. Sharing my life with you has been a constant source of happiness and a learning experience at the same time, and for that I thank you.
Finally, I want to thank my parents, the most important people in my life, for their constant motivation and support in all of my endeavors. Baba and Ma, without your dedication towards my education and well-being throughout the years this research would not have been possible. Thank you for letting me be away from home for so many years now so that I can fulfill my dreams, and thank you for always being there in my good and bad times, no matter how infrequent my phone calls can get from time to time. Thank you immensely for your patience, kindness and support.

Contents

Acknowledgements   ii
List Of Figures   ix
List Of Tables   xii
Abstract   xiv

1 Introduction   1
  1.1 Fine-grained Adversary Modeling with Plentiful Attack Data   4
  1.2 Coarse-grained Adversary Modeling with Sparse Attack Data   7
  1.3 Adversary Modeling with Belief Data   9
  1.4 Thesis Overview   12

2 Background   13
  2.1 Stackelberg Security Games   13
  2.2 Human Behavior Models   16
    2.2.1 Subjective Utility Quantal Response (SUQR)   16
    2.2.2 Bayesian SUQR   17
    2.2.3 Robust SUQR   18

3 Related Work   19
  3.1 Research in repeated games   19
    3.1.1 Learning in repeated Stackelberg games   19
    3.1.2 Robust strategies in repeated games   20
    3.1.3 Learning from reinforcements in repeated games   21
  3.2 Repeated Measures Studies   23
  3.3 Probability Weighting Functions   24
  3.4 Related research in belief modeling   26
    3.4.1 Setting without training data   26
    3.4.2 Setting with training data   29

4 Wildlife Poaching Game   31
  4.1 Game Overview   31
    4.1.1 Computation of Poacher Reward   33
    4.1.2 Non-zero sum game   34
  4.2 Payoff Structures   35
  4.3 Online Repeated Measures Experiments   36
    4.3.1 Validation and Trial Games   37
    4.3.2 Participant Retention Rate   37

5 SHARP: Probability Weighting   39
  5.1 Probability Weighting Mechanism   39
  5.2 Discussions   47

6 SHARP: Adaptive Utility Model   51
  6.1 Observations and Evidence   52
  6.2 SHARP's Utility Computation   57
  6.3 Generating Defender Strategies Against SHARP   61
  6.4 SHARP in action: An example   62
  6.5 RL-SSG: A Descriptive Reinforcement Learning Algorithm for SSGs   62

7 Results with Human Subjects on AMT   67
  7.1 Defender Utilities   68
  7.2 Learned Probability Curves   73
    7.2.1 Comparison with Prelec's probability weighting function   75
    7.2.2 Comparison with PWV-SUQR   76
  7.3 Attack surface exposure   78
  7.4 Adaptiveness of SHARP   80
  7.5 Validation and Testing Robustness of AMT findings   80
    7.5.1 Results with Security Experts in Indonesia   82
    7.5.2 Results with fraction of complete attack data   84

8 INTERCEPT   88
  8.1 Wildlife Crime Dataset   88
    8.1.1 Dataset Challenges   89
    8.1.2 Dataset Composition   89
  8.2 CAPTURE and Proposed Variants   90
  8.3 INTERCEPT   94
    8.3.1 BoostIT   95
    8.3.2 INTERCEPT: Ensemble of Experts   97

9 INTERCEPT Results   99
  9.1 Evaluation Metrics   99
  9.2 Evaluation on Historical Real-world Patrol Data   101
    9.2.1 Attackability Prediction Results   101
    9.2.2 Observation Prediction Results   104
    9.2.3 Impact of Ensemble and Voting Rules   105
  9.3 Evaluation on Real-World Deployment   107
  9.4 Lessons Learned   110

10 Belief Modeling   112
  10.1 Belief Modeling Game   112
    10.1.1 Game Overview   113
    10.1.2 Experimental Procedure   115
  10.2 Proposed Models: Setting without training data   118
    10.2.1 Perfectly Rational Adversary   119
      10.2.1.1 Uninformed Adversary   119
      10.2.1.2 Informed Adversary   121
    10.2.2 Boundedly Rational Adversary   121
  10.3 Proposed Models: Setting with training data   124
    10.3.1 Perfectly Rational Adversary   125
    10.3.2 Boundedly Rational Adversary   125
  10.4 Proposed Models: Setting with training and testing data   129
    10.4.1 Instance based Learning Models   129
    10.4.2 Clustering based Models   130

11 Belief Modeling Experiment Results   132
  11.1 Setting without training data   133
  11.2 Setting with training data   136
  11.3 Setting with training and testing data   138

12 Conclusions and Future Directions   141

13 Appendix   145
  13.1 Algorithm to learn PSU model parameters   145
  13.2 Proof of Theorem 1   147
  13.3 Sample Email   150
  13.4 Challenges and Remedies of Online Repeated Measures Experiments   151
    13.4.1 Step 1: Payment Scheme   154
    13.4.2 Initial Study Enrollment   157
    13.4.3 Reminder Emails   158
  13.5 Participant Feedback   159
    13.5.1 Feedback for "Please tell us about your experience playing the game":   160
    13.5.2 Feedback for "Did you use a particular strategy in playing the game? If yes, please specify.":   160
  13.6 Additional Experimental Results on ADS3 and ADS4   161
    13.6.1 Defender Utilities   161
    13.6.2 Learned Probability Curves   162
    13.6.3 Evidence of Attack Surface Exposure   163
    13.6.4 Adaptiveness of SHARP   164
  13.7 Robustness of SHARP's results across domains   164

Bibliography   167

List Of Figures

1.1 Different Green Security Game (GSG) domains.   3
3.1 Probability Weighting Functions   24
4.1 Game Interface for our simulated online repeated SSG (Reward, penalty and coverage probability for a selected cell are shown)   32
4.2 Animal density structures (ADS)   35
6.1 Evidence for adaptivity of attackers   55
6.2 For various values of k (the number of nearest neighbors), percentage of people who attacked similar targets in round 2 after succeeding or failing in the previous round   56
6.3 (a,b,c): Round 2 strategies generated by SHARP (with discounting without Attractiveness), SHARP (no discounting but with Attractiveness) and SHARP respectively; (d) ADS1   63
7.1 Defender utilities for various models on ADS1 and ADS2 respectively.   73
7.2 Cumulative defender utilities for various models on ADS1 and ADS2 respectively.   74
7.3 (a) Comparison of defender utilities between P-SUQR, SHARP and SHARP(d=1) on ADS1 and ADS2 respectively   74
7.4 Comparison of various models with SUQR (Pure Strategy) and the RL model on ADS1 respectively.   74
7.5 RL model based defender strategy for round 2 on ADS1   75
7.6 Learned probability curves for P-SUQR on ADS1 and ADS2 respectively.   75
7.7 Learned probability curves with Prelec's probability weighting function for P-SUQR on ADS1 and ADS2 respectively.   76
7.8 (a) Comparison of the sum of squared errors for P-SUQR with Gonzalez and Wu, and P-SUQR with Prelec's probability weighting function respectively; (b) Comparison of the sum of squared errors for P-SUQR and PWV-SUQR respectively   77
7.9 (a) - (d) Learned probability curves for PWV-SUQR on ADS1 and ADS2 respectively.   78
7.10 Total number of unique exposed target profiles till the end of each round for each coverage probability interval for ADS1 and ADS2   79
7.11 Adaptivity of SHARP and Convergence of P-SUQR on payoff structures ADS1 and ADS2 respectively.   81
7.12 SHARP based strategy for the defender on payoff structure ADS2   81
7.13 Defender utility for SHARP against security experts in Indonesia   83
7.14 Evidence for adaptivity of attackers (security experts in Indonesia)   83
7.15 Total number of unique exposed target profiles till the end of each round for each coverage probability interval for Indonesia experts data.   84
7.16 Learned probability curves for SHARP on ADS3 on the security experts dataset.   85
7.17 (a) and (b): Average 1-norm distances between defender strategies generated by SHARP when the model is learned based on randomly sampled 50% data (0%, 5%, 10% and 15% deviation from actual data) and when the model is learned from the complete dataset. Results are shown for ADS1 and ADS2 respectively.   86
7.18 (a) and (b): Average 1-norm distances between defender strategies generated by SHARP when the model is learned based on randomly sampled 50% data (0%, 5%, 10% and 15% deviation from actual data) and when the model is learned from the complete dataset. Results are shown for ADS3 and ADS4 respectively.   87
8.1 QENP   89
9.1 Illegal activities detected by rangers directed by INTERCEPT. Photo credit: Uganda Wildlife Authority ranger   109
10.1 Game Interface for simulated online belief modeling game   113
10.2 (a,b): Animal Densities; (c) Maximin; (d) SUQR   116