GAPs: Geospatial Abduction Problems

PAULO SHAKARIAN and V. S. SUBRAHMANIAN, University of Maryland
MARIA LUISA SAPINO, Università di Torino

There are many applications where we observe various phenomena in space (e.g., locations of victims of a serial killer), and where we want to infer "partner" locations (e.g., the location where the killer lives) that are geospatially related to the observed phenomena. In this article, we define geospatial abduction problems (GAPs for short). We analyze the complexity of GAPs, develop exact and approximate algorithms (often with approximation guarantees) for these problems together with analyses of these algorithms, and develop a prototype implementation of our GAP framework. We demonstrate accuracy of our algorithms on a real-world data set consisting of insurgent IED (improvised explosive device) attacks against U.S. forces in Iraq (the observations were the locations of the attacks, while the "partner" locations we were trying to infer were the locations of IED weapons caches).

Categories and Subject Descriptors: I.2.3 [Artificial Intelligence]: Deduction and Theorem Proving - Nonmonotonic reasoning and belief revision; I.2.3 [Artificial Intelligence]: Problem Solving, Control Methods, and Search - Heuristic methods; I.2.1 [Artificial Intelligence]: Applications and Expert Systems - Cartography
General Terms: Theory, Algorithms, Experimentation
Additional Key Words and Phrases: Abduction, complexity analysis, heuristic algorithms

ACM Reference Format:
Shakarian, P., Subrahmanian, V. S., and Sapino, M. L. 2011. GAPs: Geospatial abduction problems. ACM Trans. Intell. Syst. Technol. 3, 1, Article 7 (October 2011), 27 pages.
DOI = 10.1145/2036264.2036271 http://doi.acm.org/10.1145/2036264.2036271

1. INTRODUCTION

There are numerous applications where we wish to draw geospatial inferences from observations. For example, criminologists [Rossmo and Rombouts 2008; Brantingham and Brantingham 2008] have found that there are spatial relationships between a serial killer's house (the geospatial inference we wish to make), and locations where the crimes were committed (the observations).
A marine archaeologist who finds parts of a wrecked ship or its cargo at various locations (the observations) is interested in determining where the main portion of the wreck lies (the geospatial inference). Wildlife experts might find droppings of an endangered species such as the Malayan sun bear (observations) and might want to determine where the bear's den is (the geospatial inference to be made). In all these cases, we are trying to find a single location that best explains the observations (or the k locations that best explain the observations).

Some of the authors of this article were funded in part by AFOSR grant FA95500610405 and ARO grants W911NF0910206 and W911NF0910525.
Authors' addresses: P. Shakarian, U.S. Military Academy, West Point, NY 10996; email: [email protected]; V. S. Subrahmanian, University of Maryland, College Park, MD 20742; email: [email protected]; M. L. Sapino, Università di Torino, Torino, Italy; email: [email protected]
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies show this notice on the first page or initial screen of a display along with the full citation. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, to redistribute to lists, or to use any component of this work in other works requires prior specific permission and/or a fee. Permissions may be requested from the Publications Dept., ACM, Inc., 2 Penn Plaza, Suite 701, New York, NY 10121-0701 USA, fax +1 (212) 869-0481, [email protected].
© 2011 ACM 2157-6904/2011/10-ART7 $10.00
DOI 10.1145/2036264.2036271 http://doi.acm.org/10.1145/2036264.2036271
ACM Transactions on Intelligent Systems and Technology, Vol. 3, No. 1, Article 7, Publication date: October 2011.

There are two common elements in such applications. First, there is a set O of observations of the phenomena under study.
For the sake of simplicity, we assume that these observations are points where the phenomenon being studied was known to have been present. Second, there is some domain knowledge D specifying known relationships between the geospatial location we are trying to find and the observations. For instance, in the serial killer application, the domain knowledge might tell us that serial killers usually select locations for their crimes that are at least 1.2 km from their homes and at most 3 km from their homes. In the case of the sun bear, the domain knowledge might state that the sun bear usually prefers to have a den in a cave, while in the case of the wreck, it might be usually within a radius of 10 miles of the artifacts that have been found.

The geospatial abduction problem (GAP for short) is the problem of finding the most likely set of locations that is compatible with the domain knowledge D and that best "explains" the observations in O. To see why we look for a set of locations, we note that the serial killer might be using both his home and his office as launching pads for his attacks. In this case, no single location may best account for the observations. In this paper, we show that many natural problems associated with geospatial abduction are NP-Complete, which causes us to resort to approximation techniques. We then show that certain geospatial abduction problems reduce to several well-studied combinatorial problems that have viable approximation algorithms. We implement some of the more viable approaches with heuristics suitable for geospatial abduction, and test them on a real-world data set. The organization and main contributions of this article are as follows.

- Section 2 formally defines geospatial abduction problems (GAPs for short), and Section 3 analyzes their complexity.
- Section 4 develops a "naive" algorithm for a basic geospatial abduction problem called k-SEP and shows reductions to set-covering, dominating set, and linear-integer programming that allow well-known algorithms for these problems to be applied to GAPs.
- Section 5 describes two greedy algorithms for k-SEP and compares them to a reduction to the set-covering problem.
- Section 6 describes our implementation and shows that our greedy algorithms outperform the set-covering reduction in a real-world application on identifying weapons caches associated with Improvised Explosive Device (IED) attacks on US troops in Iraq. We show that even if we simplify k-SEP to only cases where k-means classification algorithms work, our algorithms outperform those.
- Section 7 compares our approach with related work.

2. GEOSPATIAL ABDUCTION PROBLEM (GAP) DEFINITION

Throughout this article, we assume the existence of a finite, 2-dimensional M × N space S (footnote 1) for some integers M, N ≥ 1 called the geospatial universe (or just universe). Each point p ∈ S is of the form (x, y) where x, y are integers and 0 ≤ x ≤ M and 0 ≤ y ≤ N. We assume that all observations we make occur within space S. We use the space shown in Figure 1 throughout this article to illustrate the concepts we introduce. We assume that S has an associated distance function d which assigns a nonnegative distance to any two points and satisfies the usual distance axioms (footnote 2).

Footnote 1: We use integer coordinates as most real-world geospatial information systems (GIS) use discrete spatial representations.
Footnote 2: d(x, x) = 0; d(x, y) = d(y, x); d(x, y) + d(y, z) ≥ d(x, z).

Fig. 1. A space. Red dots denote observations. Yellow squares denote infeasible locations. Green stars show one (0,3) explanation, while pink triangles show another (0,3) explanation.

Definition 2.1 (Observation). An observation O is any finite subset of S.

Consider the geospatial universe shown in Figure 1. In the serial killer application, the red dots would indicate the locations of the murders, while in the shipwreck example, they would indicate the locations where artifacts were found. We wish to identify the killer's location (or the sunken ship or the sun bear's den).

As mentioned earlier, there are many constraints that govern where such locations might be.
For instance, it is unlikely that the sun bear's den (or the killer's house or office) is in the water, while the sunken ship is unlikely to be on land.

Definition 2.2 (Feasibility predicate). A feasibility predicate feas is a function from S to {TRUE, FALSE}.

Thus, feas(p) = TRUE means that point p is feasible and must be considered in the search. Figure 1 denotes infeasible places via a yellow square. Throughout this article, we assume that feas is an arbitrary, but fixed predicate (footnote 3). Further, as feas is defined as a function over {TRUE, FALSE}, it can allow for user input based on analytical processes currently in place. For instance, in the military, analysts often create "MCOO" overlays where "restricted terrain" is deemed infeasible [US Army 1994]. We can also easily express feasibility predicates in a Prolog-style language; we can easily state (in the serial killer example) that point p is considered feasible if p is within R units of distance from some observation and p is not in the water. Likewise, in the case of the sun bear example, the same language might state that p is considered feasible if p is within R1 units of distance from marks on trees, within R2 units of scat, and if p has some land cover that would allow the bear to hide. A Prolog-style language that can express such notions of feasibility is the hybrid knowledge base paradigm [Lu et al. 1996] in which Prolog-style rules can directly invoke a GIS system.

Footnote 3: We also assume throughout the article that feas is computable in constant time. This is a realistic assumption, as for most applications, we assume feas to be user-defined. Hence, we can leverage a data structure indexed with the coordinates of S to allow for constant-time computation.

Definition 2.3 ((α,β) Explanation). Suppose O is a finite set of observations, E is a finite set of points in S, and α ≥ 0, β > 0 are some real numbers. E is said to be an (α,β) explanation of O iff:
- p ∈ E implies that feas(p) = TRUE, that is, all points in E are feasible, and
- (∀o ∈ O)(∃p ∈ E) α ≤ d(p, o) ≤ β, that is, every observation is neither too close nor too far from some point in E.

Thus, an (α,β) explanation is a set of points (e.g., denoting the possible locations of the home/office of the serial killer or the possible locations of the bear's den). Each point must be feasible and every observation must have an analogous point in the explanation which is neither too close nor too far.

Given an (α,β) explanation E, there may be an observation o ∈ O such that there are two (or more) points p1, p2 ∈ E satisfying the conditions of the second bullet above. If E is an explanation for O, a partnering function ℘_E is a function from O to E such that for all o ∈ O, α ≤ d(℘_E(o), o) ≤ β. ℘_E(o) is said to be o's partner according to the partnering function ℘_E. We now present a simple example of (α,β) explanations.

Example 2.1. Consider the observations in Figure 1 and suppose α = 0, β = 3. Then the two green stars denote an (α,β) explanation, that is, the set {(6,6), (12,8)} is a (0,3) explanation. So is the set of three pink triangles, that is, the set {(5,6), (10,6), (13,9)} is also a (0,3) explanation.

The basic problem that we wish to solve in this article is the following.

The Simple (α,β) Explanation Problem (SEP)
INPUT: Space S, a set O of observations, a feasibility predicate feas, and numbers α ≥ 0, β > 0.
OUTPUT: "Yes" if there exists an (α,β) explanation for O, "no" otherwise.

A variant of this problem is the k-SEP problem which requires, in addition, that E contains k elements or less, for k < |O|. Yet another variant of the problem tries to find an explanation E that is "best" according to some cost function.

Definition 2.4 (Cost function χ). A cost function χ is a mapping from explanations to nonnegative reals.

We will assume that cost functions are designed so that the smaller the value they return, the more desirable an explanation is. Some example cost functions are given below. The simple one below merely looks at the mean distances between observations and their partners.
Example 2.2 (Mean-distance). Suppose S, O, feas, α, β are all given and suppose E is an (α,β) explanation for O and ℘_E is a partnering function. We could initially set the cost of an explanation E (with respect to this partnering function) to be:

$$\chi_{\wp_E}(E) = \frac{\sum_{o \in O} d(o, \wp_E(o))}{|O|}.$$

Suppose ptn(E) is the set of all partner functions for E in the above setting. Then we can set the cost of E as:

$$\chi_{mean}(E) = \inf\{\chi_{\wp_E}(E) \mid \wp_E \in ptn(E)\}.$$

This definition removes reliance on a single partnering function as there may be several partnering functions associated with a single explanation. We illustrate this definition using our sun bear example.

Example 2.3. Wildlife experts have found droppings and other evidence of the Malayan sun bear in a given space, S, depicted in Figure 2. Points {o1, o2, o3} indicate locations of evidence of the Malayan sun bear (we shall refer to these as set O). Points {p1, p2, ..., p8} indicate feasible dwellings for the bear. The concentric rings around each element of O indicate the distance α = 1.7 km and β = 3.7 km. The set {p3, p6} is a valid (1.7, 3.7) explanation for the set of evidence, O. However, we note that observation o2 can be partnered with either point. If we are looking to minimize distance, we notice that d(o2, p3) = 3 km and d(o2, p6) = 3.6 km; hence, p3 is the partner for o2 such that the distance is minimized.

Fig. 2. Left: Points {o1, o2, o3} indicate locations of evidence of the Malayan sun bear (we shall refer to these as set O). Points {p1, p2, ..., p8} indicate feasible dwellings for the bear. The concentric rings around each element of O indicate the distance α = 1.7 km and β = 3.7 km. Right: Points {p1, p2, p3} are feasible for crime scenes {o1, o2}. {p1, p2} are safe houses within a distance of [1,2] km from crime scene o1 and {p2, p3} are safe houses within a distance of [1,2] km from crime scene o2.

We now define an "optimal" explanation as one that minimizes cost.

Definition 2.5.
Suppose O is a finite set of observations, E is a finite set of points in S, α ≥ 0, β > 0 are some real numbers, and χ is a cost function. E is said to be an optimal (α,β) explanation iff E is an (α,β) explanation for O and there is no other (α,β) explanation E′ for O such that χ(E′) < χ(E).

We present an example of optimal (α,β) explanations in the following.

Example 2.4. Consider the sun bear from Example 2.3 whose behavior is depicted in Figure 2 (left). While {p3, p6} is a valid solution for the k-SEP problem (k = 2), it does not optimize mean distance. In this case the mean distance would be 3 km. However, the solution {p3, p7} provides a mean distance of 2.8 km.

Suppose we are tracking a serial killer who has struck at locations O = {o1, o2}. The points {p1, p2, p3} are feasible locations as safe houses for the killer (partners). This is depicted in Figure 2 (right). Based on historical data, we know that serial killers' strikes are at least 1 km away from a safe house and at most 2 km from the safe house (α = 1, β = 2). Thus, for k = 2, any valid explanation of size 2 provides an optimal solution with respect to mean distance, as every feasible location for a safe house is within 2 km of a crime scene.

We are now ready to define the cost-based explanation problem.

The Cost-based (α,β) Explanation Problem
INPUT: Space S, a set O of observations, a feasibility predicate feas, numbers α ≥ 0, β > 0, a cost function χ, and a real number v > 0.
OUTPUT: "Yes" if there exists an (α,β) explanation E for O such that χ(E) ≤ v, "no" otherwise.

It is easy to see that standard classification problems like k-means (footnote 4) can be captured within our framework by simply assuming that α = 0, β > max(M,N)² and that all points are feasible. In contrast, standard classification algorithms cannot take feasibility into account, and this is essential for the above types of applications.

Footnote 4: See Alpaydin [2010] for a survey on classification work.

3. COMPLEXITY OF GAP PROBLEMS

SEP can be easily solved in PTIME. Given a set O of observations, for each o ∈ O, let P_o = {p ∈ S | feas(p) = TRUE ∧ α ≤ d(p, o) ≤ β}. If P_o ≠ ∅ for each o, we return "yes." We call this algorithm STRAIGHTFORWARD-SEP. Another algorithm would merely find the set F of all feasible points and return "yes" iff for every observation o, there is at least one point p ∈ F such that α ≤ d(p, o) ≤ β. In this case, F is the explanation produced, but it is a very poor explanation. In the serial killer example, F merely tells the police to search all feasible locations without trying to do anything intelligent. k-SEP allows the user to constrain the size of the explanation so that "short and sweet" explanations that are truly meaningful are produced. The following result states that k-SEP is NP-Complete; the proof is a reduction from Geometric Covering by Discs (GCD) [Johnson 1982].

THEOREM 3.1. k-SEP is NP-Complete.

In the associated optimization problem with k-SEP, we wish to produce an explanation of minimum cardinality. Note that minimum cardinality is a common criterion for parsimony in abduction problems [Reggia and Peng 1990]. We shall refer to this problem as MINSEP. This problem is obviously NP-hard by Theorem 3.1. We can adjust STRAIGHTFORWARD-SEP to find a solution to MINSEP by finding the minimum hitting set of the P_o's.

Example 3.1. Consider the serial killer scenario in Example 2.4 and Figure 2 (right). Crime scene (observation) o1 can be partnered with two possible safe houses {p1, p2} and crime scene o2 can be partnered with {p2, p3}. We immediately see that the potential safe house located at p2 is in both sets. Therefore, p2 is an explanation for both crime scenes. As this is the only such point, we conclude that {p2} is the minimum-sized solution for the SEP problem. However, while it is possible for STRAIGHTFORWARD-SEP to return this set, there are no assurances it does.
As we saw in Example 2.4, E = {p1, p2} is a solution to SEP, although a solution with lower cardinality ({p2}) exists. This is why we introduce the MINSEP problem.

With the complexity of k-SEP, the following corollary tells us the complexity class of the Cost-based Explanation problem. We show this reduction by simply setting the cost function χ(E) = |E|.

COROLLARY 3.1. Cost-based Explanation is NP-Complete.

As described earlier, MINSEP has the feel of a set-covering problem. Although the generalized cost-based explanation cannot be directly viewed with a similar intuition (as the cost maps explanations to reals, not elements of S), there is an important variant of the Cost-based problem that does. We introduce weighted SEP, or WT-SEP here.

Weighted Spatial Explanation (WT-SEP)
INPUT: A space S, a set O of observations, a feasibility predicate feas, numbers α ≥ 0, β > 0, a weight function c : S → ℝ, and a real number v > 0.
OUTPUT: "Yes" if there exists an (α,β) explanation E for O such that $\sum_{p \in E} c(p) \le v$, "no" otherwise.

In this case, we can easily show NP-Completeness by reduction from k-SEP: we simply set the weight for each element of S to be one, causing $\sum_{p \in E} c(p)$ to equal the cardinality of E.

COROLLARY 3.2. WT-SEP is NP-Complete.

Cost-based explanation problems presented in this section are very general. While the complexity results hold for an arbitrary function in a general case, we also consider specific functions as well. In the following, we present the total-distance minimization explanation problem (TD-SEP). This is a problem where we seek to minimize the sum of distances between observations and their closest partners while imposing a restriction on cardinality.

Total Distance Minimization Explanation Problem (TD-SEP)
For space S, let d : S × S → ℝ be the Euclidean distance between two points in S.
INPUT: A space S, a set O of observations, a feasibility predicate feas, numbers α ≥ 0, β > 0, positive integer k < |O|, and real number v > 0.
OUTPUT: "Yes" if there exists an (α,β) explanation E for O such that |E| = k and $\sum_{o_i \in O} \min_{p_j \in E} d(o_i, p_j) \le v$, "no" otherwise.

THEOREM 3.2. TD-SEP is NP-Complete.

The NP-hardness of TD-SEP is based on a reduction from the k-Median Problem [Papadimitriou 1981]. This particular reduction (details in the appendix) also illustrates how the k-median problem is a special case of GAPs, but k-median problems cannot handle arbitrary feasibility predicates of the kind that occur in real-life geospatial reasoning. The same argument applies to k-means classifiers as well.

4. EXACT ALGORITHM FOR GAP PROBLEMS

This section presents four exact approaches to solve k-SEP and WT-SEP. First, we provide an enumerative approach that exhaustively searches for an explanation. Then, we show that the problem reduces to set-cover, dominating set, and linear-integer programming. Existing algorithms for these problems can hence be used directly. Throughout this section, we shall use the symbol Δ to represent the bound on the number of partners that can be associated with a single observation and f to represent the bound on the number of observations supported by a single partner. Note that both values are bounded by π(β² − α²); however, they can be much less in practice; specifically, f is normally much smaller than Δ.

4.1 Naive Exact Algorithm

We now show correctness of NAIVE-KSEP-EXACT. This algorithm provides an exact solution to k-SEP but takes exponential time (in k). The algorithm first identifies a set L of all elements of S that could be possible partners for O. Then, it considers all subsets of L of size less than or equal to k. It does this until it identifies one such subset as an explanation.

PROPOSITION 4.1. If there is a k-sized simple (α,β) explanation for O, then NAIVE-KSEP-EXACT returns an explanation. Otherwise, it returns NO.

Finally, we have the complexity of the algorithm.

PROPOSITION 4.2. The complexity of NAIVE-KSEP-EXACT is $O\left(\frac{1}{(k-1)!}\,(\pi(\beta^2-\alpha^2)\,|O|)^{(k+1)}\right)$.
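The enumeration described in Section 4.1 can be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the article's implementation: it stores, for each candidate partner, a set of observation indices in place of the binary strings of the listing, it assumes the space S is given as an iterable of points, and the function name and Euclidean default distance are illustrative choices.

```python
import math
from itertools import combinations


def naive_ksep_exact(S, O, feas, alpha, beta, k, d=math.dist):
    """Return a set E with |E| <= k that explains O, or None if none exists."""
    O = list(O)
    # covers[p] holds the indices of the observations p can partner with
    # (a set-based stand-in for the per-point bit strings).
    covers = {}
    for i, o in enumerate(O):
        for p in S:
            if feas(p) and alpha <= d(o, p) <= beta:
                covers.setdefault(p, set()).add(i)
    candidates = list(covers)            # the list L of possible partners
    everything = set(range(len(O)))
    # Try subsets in order of increasing size, up to k elements.
    for size in range(1, k + 1):
        for combo in combinations(candidates, size):
            if set().union(*(covers[p] for p in combo)) == everything:
                return set(combo)
    return None                          # corresponds to returning NO
```

Because subsets are tried in order of increasing size, the first explanation found by this sketch also has minimum cardinality among explanations of size at most k; the running time remains exponential in k.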
Algorithm 1 (NAIVE-KSEP-EXACT)
INPUT: Space S, a set O of observations, a feasibility predicate feas, real numbers α ≥ 0, β > 0, and natural number k > 0
OUTPUT: Set E ⊆ S of size k (or less) that explains O
(1) Let M be a matrix array of pointers to binary strings in {0,1}^|O|. M is of the same dimensions as S. Each element in M is initialized to NULL. For a given p ∈ S, M[p] is the place in the array.
(2) Let L be a list of pointers to binary strings. L is initialized as null.
(3) For each o_i ∈ O do the following
    (a) Determine all points p ∈ S such that α ≤ d(o_i, p) ≤ β and feas(p) = TRUE.
    (b) For each of these points, p, if M[p] = NULL then initialize a new array where only bit i is set to 1. Then add a pointer to M[p] in L.
    (c) Otherwise, set bit i of the existing array to 1.
(4) For any k elements of L (actually the k elements pointed to by elements of L), we shall designate ℓ_1, ..., ℓ_j, ..., ℓ_k as the elements. We will refer to the ith bit of element ℓ_j as ℓ_j(i).
(5) Exhaustively generate all possible combinations of k elements of L until one such combination is found where ∀i ∈ [1, |O|], $\sum_{j=1}^{k} \ell_j(i) > 0$
(6) If no such combination is found, return NO. Otherwise, return the first combination that was found.

An exact algorithm for the cost-based explanation problems follows trivially from the NAIVE-KSEP-EXACT algorithm by adding the step of computing the value for χ for each combination. Provided this computation takes constant time, this does not affect the $O\left(\frac{1}{(k-1)!}\,(\pi(\beta^2-\alpha^2)\,|O|)^{(k+1)}\right)$ runtime of that algorithm.

4.2 An Exact Set-Cover Based Approach

We now show that k-SEP polynomially reduces to an instance of the popular set-covering problem [Karp 1972] which allows us to directly apply the well-known greedy algorithm reviewed in Paschos [1997]. Set-Cover is defined as follows.

The Set-Cover Problem (Set-Cover)
INPUT: Set of elements, E and a family of subsets of E, F ≡ {S_1, ..., S_max}, and positive integer k.
OUTPUT: "Yes" if there exists a k-sized subset of F, F_k, such that $\bigcup_{i=1}^{k} \{S_i \in F_k\} \equiv E$.

Through a simple modification of NAIVE-KSEP-EXACT, we can take an instance of k-SEP and produce an instance of Set-Cover. We run the first four steps, which only takes O(Δ · |O|) time by the proof of Proposition 4.2.

THEOREM 4.1. k-SEP polynomially reduces to Set-Cover.

Example 4.1. Consider the serial killer scenario in Example 2.4 and Figure 2 (right). Suppose we want to solve this problem as an instance of k-SEP by a reduction to set-cover. We consider the set of crime-scene locations, O ≡ {o1, o2} as the set we wish to cover. We obtain our covers from the first four steps of NAIVE-KSEP-EXACT. Let us call the result list L. Hence, we can view the values of the elements in L as the following sets S1 ≡ {o1}, S2 ≡ {o1, o2}, S3 ≡ {o2}. These correspond with points p1, p2, p3 respectively. As S2 covers O, p2 is an explanation.

The traditional approach for approximation of set-cover has a time complexity of O(|E| · |F| · size), where size is the cardinality of the largest set in the family F (i.e., size = max_{i ≤ |F|} |S_i|). This approach obtains an approximation ratio of 1 + ln(size) [Paschos 1997]. As f is the quantity of the largest number of observations supported by a single partner, the approximation ratio for k-SEP using a greedy scheme after a reduction from set-cover is 1 + ln(f). The NAIVE-KSEP-SC algorithm below leverages the above reduction to solve the k-SEP problem.

Algorithm 2 (NAIVE-KSEP-SC)
INPUT: Space S, a set O of observations, a feasibility predicate feas, and real numbers α ≥ 0, β > 0
OUTPUT: Set E ⊆ S that explains O
(1) Initialize list E to null
(2) Let M be a matrix array of the same dimensions as S of lists of pointers initialized to null. For a given p ∈ S, M[p] is the place in the array.
(3) Let L be a list of pointers to lists in M, L is initialized to null.
(4) Let O′ be an array of Booleans of length |O|. ∀i ∈ [1, |O|], initialize O′[i] = TRUE. For some element o ∈ O, O′[o] is the corresponding space in the array.
(5) Let numObs = |O|
(6) For each element o ∈ O, do the following.
    (a) Determine all elements p ∈ S such that feas(p) = TRUE and d(o, p) ∈ [α, β]
    (b) If there does not exist a p ∈ S meeting the above criteria, then terminate the program and return IMPOSSIBLE.
    (c) If M[p] = null then add a pointer to M[p] to L
    (d) Add a pointer to o to the list M[p].
(7) While numObs > 0 loop
    (a) Initialize pointer cur_ptr to null
    (b) Initialize integer cur_size to 0
    (c) For each ptr ∈ L, do the following:
        (i) Initialize integer this_size to 0
        (ii) Let M[p] be the element of M pointed to by ptr
        (iii) For each obs_ptr in the list M[p], do the following
            A. Let i be the corresponding location in array O′ to obs_ptr
            B. If O′[i] = TRUE, increment this_size by 1
        (iv) If this_size > cur_size, set cur_size = this_size and have cur_ptr point to M[p]
    (d) Add p to E
    (e) For every obs_ptr in the list pointed to by cur_ptr, do the following:
        (i) Let i be the corresponding location in array O′ to obs_ptr
        (ii) If O′[i], then set it to FALSE and decrement numObs by 1
    (f) Add the location in space S pointed to by cur_ptr to E
(8) Return E

PROPOSITION 4.3. NAIVE-KSEP-SC has a complexity of O(Δ · f · |O|²) and an approximation ratio of 1 + ln(f).

PROPOSITION 4.4. A solution E to NAIVE-KSEP-SC provides a partner to every observation in O if a partner exists; otherwise, it returns IMPOSSIBLE.

The algorithm NAIVE-KSEP-SC is a naive, straightforward application of the O(|E| · |F| · size) greedy approach for set-cover as presented in Paschos [1997]. We note that it is possible to implement a heap to reduce the time complexity to O(Δ · f · |O| · lg(Δ · |O|)), avoiding the cost of iterating through all possible partners in the inner loop.

In addition to the straightforward greedy algorithm for set-covering, there are several other algorithms that provide different time complexity/approximation ratio combinations. However, with a reduction to the set-covering problem we must consider the result of Lund and Yannakakis [1994], which states that set-cover cannot be approximated within a ratio c · log(n) for any c < 0.25 (where n is the number of subsets in the family F) unless NP ⊆ DTIME[n^{poly log n}].

A reduction to set-covering has the advantage of being straightforward. It also allows us to leverage the wealth of approaches developed for this well-known problem. In the next section, we show that k-SEP reduces to the dominating set problem as well. We then explore alternate approximation techniques based on this reduction.

4.3 An Exact Dominating Set Based Approach

We show below that k-SEP also reduces to the well-known dominating set problem (DomSet) [Garey and Johnson 1979] allowing us to potentially leverage fast algorithms such as the randomized-distributed approximation scheme in Jia et al. [2002]. DomSet is defined as follows.

Dominating Set (DomSet)
INPUT: Graph G = (V, E) and positive integer K ≤ |V|.
OUTPUT: "Yes" if there is a subset V′ ⊂ V such that |V′| ≤ K and such that every vertex v ∈ V − V′ is joined to at least one member of V′ by an edge in E.

Algorithm 3 (KSEP-TO-DOMSET)
INPUT: Space S, a set O of observations, a feasibility predicate feas, and real numbers α ≥ 0, β > 0
OUTPUT: Graph G_O for use in an instance of a DomSet problem
(1) Let G_O = (V_O, E_O) be a graph. Set V_O = S and E_O = ∅.
(2) Let 𝒮 be a mapping defined as 𝒮 : S → V_O. In words, 𝒮 takes elements of the space and returns nodes from G_O as defined in the first step. This mapping does not change during the course of the algorithm.
(3) For each o_i ∈ O do the following
    (a) Determine all points p ∈ S that are such that α ≤ d(o_i, p) ≤ β. Call this set P_i
    (b) For all p ∈ P_i calculate feas(p). If feas(p) = FALSE, remove p from P_i.
    (c) Let V_i = {v ∈ V_O | ∃p ∈ P_i such that 𝒮(p) = v}.
    (d) Add |P_i| new nodes to V_O. Add these nodes to V_i as well.
    (e) For every pair of nodes v1, v2 ∈ V_i, add edge (v1, v2) to E_O.
(4) Remove all v ∈ V_O where there does not exist a v′ such that (v, v′) ∈ E_O
(5) If any P_i ≡ ∅ return IMPOSSIBLE. Otherwise return G_O.
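To make the graph construction of Algorithm 3 (KSEP-TO-DOMSET) and the DomSet condition more concrete, here is a hedged Python sketch. The representation is an assumption: nodes are the points of S plus tagged tuples standing in for the "new nodes" of step 3(d), edges are unordered pairs stored as frozensets, and `is_dominating` is a hypothetical helper that only checks the DomSet condition. Turning a dominating set back into an explanation is not shown.

```python
import math
from itertools import combinations


def ksep_to_domset(S, O, feas, alpha, beta, d=math.dist):
    """Build (nodes, edges) along the lines of Algorithm 3.

    Returns None (the listing's IMPOSSIBLE) if some observation has
    no feasible partner within [alpha, beta].
    """
    S = list(S)
    nodes = set(S)                       # step 1: one node per point of S
    edges = set()
    for i, o in enumerate(O):            # step 3
        P_i = [p for p in S if alpha <= d(o, p) <= beta and feas(p)]
        if not P_i:                      # step 5
            return None
        # step 3(d): |P_i| fresh nodes; the tag keeps them distinct from points
        fresh = [("aux", i, j) for j in range(len(P_i))]
        nodes.update(fresh)
        V_i = P_i + fresh
        for u, v in combinations(V_i, 2):    # step 3(e): clique on V_i
            edges.add(frozenset((u, v)))
    touched = set().union(*edges) if edges else set()
    return nodes & touched, edges        # step 4: drop isolated nodes


def is_dominating(nodes, edges, chosen):
    """True if every node is in `chosen` or adjacent to a member of it."""
    chosen = set(chosen)
    dominated = set(chosen)
    for e in edges:
        if e & chosen:
            dominated |= e
    return nodes <= dominated
```

On hypothetical data with observations `(2, 1)` and `(4, 1)`, feasible points `(1, 1)`, `(3, 2)`, `(5, 1)`, α = 1 and β = 2, the point `(3, 2)` lies in both cliques, so the single node `(3, 2)` dominates the resulting graph.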
As the dominating set problem relies on finding a certain set of nodes in a graph, our reduction algorithm, Algorithm 3, unsurprisingly takes space S, an observation set O, feasibility predicate feas, and numbers α, β and returns graph G_O based on these arguments.

We now present an example to illustrate the relationship between a dominating set of size k in G_O and a k-sized simple (α,β) explanation for O.

Example 4.2. Consider the serial killer scenario in Example 2.4, pictured in Figure 2 (right). Suppose we want to solve this problem as an instance of k-SEP by a reduction to DomSet. We want to find a 1-sized simple (α,β) explanation (safe-house)
