Sampling a Network to Find Nodes of Interest PivithuruWijegunawardana1,VatsalOjha2,RaluccaGera3∗,andSucheta Soundarajan1 1SyracuseUniversity,DepartmentofElectricalEngineering&ComputerScience, ppwijegu,[email protected] 2DoughertyValleyHighSchool,SanRamon,California,[email protected] 3NavalPostgraduateSchool,DepartmentofAppliedMathematics,Monterey,CA, [email protected] Abstract. Thefocusofthecurrentresearchistoidentifypeopleofinterestin 7 socialnetworks.Weareespeciallyinterestedinstudyingdarknetworks,which 1 representillegalorcovertactivity.Insuchnetworks,peopleareunlikelytodis- 0 closeaccurateinformationwhenqueried.WepresentREDLEARN,analgorithm 2 forsamplingdarknetworkswiththegoalofidentifyingasmanynodesofinterest n aspossible.Weconsidertworealisticlyingscenarios,whichdescribehowindi- a vidualsinadarknetworkmayattempttoconcealtheirconnections.Wetestand J present our results on several real-world multilayered networks, and show that 9 REDLEARNachievesuptoa340%improvementoverthenextbeststrategy. ] I Keywords: multilayerednetworks,samplingmethods,lyingscenarios,nodesof S interest . s c [ 1 IntroductionandMotivation 1 v Theadventofbigdataintoday’scomplexenvironmentrequiresdecisionmakerstoact 8 in an overwhelmingly rich environment (or network) based on partial information of 9 thatnetwork.Usinganobservedportionofsomenetwork,itisoftendesirabletomake 2 inferencesaboutwhichunobservedareasofthenetworktoexplorenext.Often,itisof 2 particularinteresttolocate“peopleofinterest”(POI)residinginthesenetworks.These 0 . individualsmaybetryingtoconcealthemselves,ormaybeprotectedbyothers. 1 Our work was motivated by study of terrorist networks. These networks typically 0 include several types of relationships connecting people, producing multilayered net- 7 1 workswhereeachlayerisdefinedbyadifferentrelationship.Forexample,oneofthe : networks contains data depicting relationships among terrorists in the ’Noordin Top’ v NetworkinIndonesia,whererelationshipsareorganizationstheseterroristsbelongto, i X theschoolsortrainingstheywentto,kinship,recruitingandsoon.Thedatawascol- r lectedbyEvertonet.al[19]andcompiledintoanetworkbyGeraetal.in[11]. a It is generally desired to find those POIs that possess a certain attribute, such as peoplewhoattendedaspecificplanningmeetingorwhowereinvolvedinorganizinga particularattack.Forourspecificwork,intheinsurgentnetworkNoordinTop(seeSec- tion5.1)allthepeoplethatwerecommunicatingusingacertainmediumweretaggedas ∗RaluccaGerawouldliketothanktheDoDforpartiallysponsoringthecurrentresearch. 2 Wijegunawardana,Ojha,Gera,Soundarajan, beingPOI,producingarelativelysmallnumberofPOI.Weassumethatwebeginwith knowledgeofonePOIinthenetwork,withtherestofthenetworkunobserved(bothin termsoftopologyaswellasnodeattributes).Ourgoalistosamplethenetworkinsuch a way that we observe as many POIs as possible. To our knowledge, we are the first toconsidertheproblemofsamplingwiththegoalofidentifyingnodesofinterestina settingwithmisinformation. Wepresent REDLEARN,anovel,learning-basedalgorithmforsamplingnetworks withthegoaloffindingasmanyPOIsaspossible.WeshowthatincaseswherethePOIs exhibithomophily(i.e.,arelikelytobeconnectedtootherPOIs),asimplestrategyof choosingthenodewiththemostPOIneighborsworkswell.However,inthemorere- alisticscenariowherePOIshidetheirconnectionswithotherPOIs,REDLEARNshows outstandingperformance,beatingthenextbeststrategybyupto340%. 2 ProblemDefinition WerefertonodesrepresentingPOIsas‘red’nodes,andothernodesas‘blue’,givingus apurplenetwork.Weassumethatthereisanunobserved,underlyinggraphG=(V,E), inwhicheachnodev∈V hascolorC ∈{red,blue}.Webeginwithhavingknowledge v ofonlyonerednodeinG. Toincreaseourobservationofthenetwork,weplacemonitorsonnodes.Amonitor tellsus(1)thetruecolorofthenodebeingplacedon,(2)thetrueneighborsofthatnode, and (3) the colors of the node’s neighbors, possibly with inaccuracies. For example, placing a monitor on a suspected terrorist could represent determining whether that personisactuallyaterrorist,determiningwhohisorhere-mailorphonecontactsare, andquestioningtheindividualaboutwhetherthoseneighborsarethemselvesterrorists. Naturally,someindividualsmaylieaboutthecolorsofthoseneighbors.1 Weassumethatwearegivenabudgetofbmonitors,andcanplacethosemonitors on any node that has been observed. In the first step, we must place a monitor on the initiallyobservednode,becausenoothernodeswereobserved.Insubsequentsteps,we mayplaceamonitoronanynodethathasbeenobservedasaneighborofapreviously- monitorednode. Ourgoalistomaximizethetotalnumberofrednodesobserved. 3 RelatedWork Ourworkhereisrelatedtoworkonanalyzingdarknetworks,aspecialtypeofsocial network[5].Adarknetworkisnetworkthatisillegalandcovert[18],whosemembers areactivelytryingtoconcealnetworkinformationevenattheexpenseofefficiency[5], and the existing connections are used infrequently [18]. Because a dark network is deceptive by nature, we examine the lying methodologies along with the discovery methodsinlookingforthePOI. There are a multitude of sampling techniques for network exploration, including random walks ([3], [13], [17]), biased random walks ([10]), or walks combined with 1Weconsidertworealistic’lyingscenarios’;thesearedescribedinSection4.2. SamplingaNetworktoFindNodesofInterest 3 reversible Markov Chains([2]), Bayesian methods([9]), or standard exhaustive search algorithmslikedepth-firstorbreadth-firstsearches,suchas [1,6,7,8,14].However, thesemethodsfailinusingdiscoveredknowledge,suchasnodeattributes,effectively. Various researchers have considered the problem of sampling for specific goals, such as maximizing the number of nodes observed. For example, Avrachenkov, et al. present an algorithm to sample the node with the highest estimated unobserved de- gree [4]. Hanneke and Xing [12], and Maiya and Berger-Wolf [16] examine online samplingforcentralitymeasures.MacskassyandProvostdevelopaguilt-by-association methodtoidentifysuspiciousindividualsinapartially-knownnetwork[15]. 4 ProposedMethod: REDLEARN Amonitorplacementstrategyisanincrementalsamplingstrategy.Ineachstep,ituses current knowledge of the discovered graph, including the observed topology of the graph, known colors of nodes (observed by monitors placed directly on those nodes), andthestatedcolorsofmonitorednodes’neighbors(i.e.,foreachneighborofamoni- torednode,whetherthemonitorednodesaidthatthatneighborwasredorblue).With this information, the strategy determines where to place the next monitor to discover themostnumberofrednodes.Thiscontinuesuntilthemonitorbudgetisexhausted. monitorednodeisanodewithamonitorplacedonit.Amonitorednodeisknown red(orblue).Recallthatamonitorednodemaylieaboutitsneighbors’colors;thusthe colorofanodeisnotknownwithcertaintyuntilitismonitored.LetN(v)refertothe setofneighborsofv. 4.1 BaselineMonitorPlacementStrategies Wenowdescribeseveralnaturalmonitorplacementstrategies.Weusethesestrategies ascomparisonalgorithmsinourexperiments;additionally,someofthesestrategiesare usedindevelopingourlearningstrategyREDLEARN. SmartRandomSampling(SR):Ineachstep,theSmartRandomPlacementstrat- egy places monitors on a random unmonitored node. Due to the inefficiency of using noinformationaboutthenetworkformonitorplacement,thisisusedasalowerbound placementstrategyfortheintroducedmethodologies. RedScore(RS):TheRedScorestrategyisguidedbythecolorsreportedbyneigh- borsofanode.Ifanodevreportsitsneighboruasred,thescoreassociatedwithnode u is increased by one, making it more suspicious. This strategy selects the node with highest red score to place the next monitor. For this method, the red score is highly impactedbytheaccuracyofinformationgivenbytheneighboringnode.Additionally, duetoitsuseofbothredandbluenodeinformation,thisstrategyusesthemostamount ofinformationascomparedtotheotherbaselinestrategies. Most Red Say Red (MRSR): The MRSR strategy places a monitor on the node withthegreatestnumberofredneighborswhoreportitasarednode.Itdoesnotfactor in blue node information and is dependent solely on the accuracy of the information given by neighboring red nodes. Blue nodes are essentially useless in this strategy, mimicking the reality when they might not know who the POIs are. This placement 4 Wijegunawardana,Ojha,Gera,Soundarajan, strategywouldresultinarednodewithnoredneighborsbeingimpossibletodiscover exceptbychance. Most Red Neighbors (MRN): The MRN placement strategy places a monitor on thenodewiththemostknownredneighbors.Thisstrategyignoreswhatthemonitored nodes say about their neighbors. This strategy would likely work best in a network withhighhomophily.SimilartotheMRSRstrategy,blueneighborsareunimportantin determiningthelikelihoodofagivennodebeingred. 4.2 REDLEARN:ALearningBasedMonitorPlacementStrategy Whendeterminingwhichnodevtoplacethenextmonitoron,foreachcandidatenode v (i.e., each observed but unmonitored node), the strategies above consider the colors ofv’sneighborsand/orthecolorthateachofv’smonitoredneighborsreported. Intuitively, if the original network displays homophily, the probability of node v being a red node is higher if it has many red neighbors. But if the network does not display homophily, using this measure can result in poor performance. Note that we expectmanydarknetworkstoexhibitloworevenanti-homophily,asmaliciousactors arelikelytogooutoftheirwaytoconcealtheirconnections. Additionally, some monitor placement strategies like red score, depend on infor- mation given by neighbors. The performance of such monitor placement criteria thus heavilydependonthepoliciesthatnodesfollowwhenstatingtheirneighbors’colors. Toovercomethesedependencies,weproposeREDLEARN,alearningbasedmonitor placementstrategy.Ourgoalistosuccessfullypredicttheprobabilityofanodevbeing red (P(v=R)) based on the observed network structure and what v’s neighbors say aboutv.Wemodelthisasatwoclassclassificationproblem,butratherthanlookingat theassignedlabel(RedorBlue),wearemoreinterestedinfindingP(v=R).Oncethese probabilitiesaredetermined,REDLEARNplacesthenextmonitoronthenodewiththe highestsuchprobability. Features: Table 1 describes the set of features used in our learning based moni- torplacementalgorithm.Therearetwotypesoffeatures:(a)Networkstructure-based features(1,2,3),and(b)Neighboranswer-basedfeatures(4,5,6,7,8). Table1:ClassificationfeaturesforREDLEARN.ConsideranodevwithneighborsN(v) Feature Description (1) NumberofRedNeighbors |{u∈N(v)|cu=R}| (2) NumberofBlueneighbors |{u∈N(v)|cu=B}| (cid:12) (cid:12) (cid:12) (cid:12) (3) NumberofRedtrianglesifvisred (cid:12){u,w∈N(v)|u∈N(w)∩w∈N(u)∩cu=cw=R}(cid:12) (cid:12) (cid:12) (4) Redscore |{u∈N(v)|(usaysR)}| (5) NumberofRedneighborssayingred |{u∈N(v)|(usaysR)∩cu=R}| (6) Numberofredneighborssayingblue |{u∈N(v)|(usaysB)∩cu=R}| (7) Numberofblueneighborssayingred |{u∈N(v)|(usaysR)∩cu=B}| (8) Numberofblueneighborssayingblue |{u∈N(v)|(usaysB)∩cu=B}| (9) Inferredprobabilityofbeingred PI(v=R) SamplingaNetworktoFindNodesofInterest 5 Networkstructure-basedfeaturesareusedtolearnthepatternsofconnectionsbetween red nodes (e.g., homophily vs. anti-homophily). Neighbor answer-based features are intendedtolearntherelationshipbetweenwhatanodesaysaboutitsneighbors’colors andthetruecolorsofthoseneighbors. Inferred probability of being red: We formulate four different probabilities to measurethetrustworthinessofcolorsgivenbydifferentlycolorednodes(i.e.,whether amonitorednodeliesorishonestaboutitsneighbors’colors).Weusetheinformation collectedpriortoplacingamonitoronanodetocalculatetheseprobabilities.Consider anodevwhichwasdiscoveredthroughamonitorplacedonnodeu. |{(v=R)∩(u=R)∩(usaysR)}| 1. P(v=R|(u=R)∧(uSaysR))= |{(u=R)∩(usaysR)}| |{(v=R)∩(u=R)∩(usaysB)}| 2. P(v=R|(u=R)∧(uSaysB))= |{(u=R)∩(usaysB}| |{(v=R)∩(u=B)∩(usaysR)}| 3. P(v=R|(u=B)∧(uSaysR))= |{(u=B)∩(usaysR)}| |{(v=R)∩(u=B)∩(usaysB)}| 4. P(v=R|(u=B)∧(uSaysB))= |{(u=B)∩(usaysB)}| Givenanodev,wecalculatetheinferredprobability,PI(v=R)usingequation1. ∑ P(v=R|color(u)∧color(usaysv)) PI(v=R) = u∈N(v) (1) |N(v)| TrainingData:Supposethatwehaveplacednmonitorssofar.Thenthetrainingset consistsofthetruecolorsofthenmonitorednodesalongwiththeirrespectivefeature values.Todeterminewheretoplacethe(n+1)st monitor,wetrainthelearningmodel usingthisdata. ClassificationAlgorithm:OurgoalisfindingP(v=R)foreachunmonitorednode v.Becausewearepredictingaprobabilityratherthanabinarylabel,weproposeusinga logisticregressionclassifier.Furthermore,becausethelearningmodelmustbeupdated frequently,thisclassifiergivesanaddedadvantageoffastertraining. Placing the Next Monitor (Prediction): Given the placement of n monitors and deciding to place the (n+1)st monitor, REDLEARN considers all unmonitored nodes discoveredsofar.Next,itcalculatesfeaturevectorsassociatedwiththesenonmonitor nodes,andappliestheclassifiertothesefeaturevectors,givingtheprobabilitythateach unmonitorednodeisred.REDLEARNselectsthenodewiththehighestprobabilityfor placingthenextmonitor.Algorithm1summarizesREDLEARN. 5 ExperimentalSetUp In section 5.1, we give a description of our network datasets, and then consider them withouthomophilyasdescribedinSection5.2.InSection5.3,weintroducetwolying scenariostomodelthelyingbehaviorofanode(i.e.,whetheritsaysitsneighborsare redorblue).Finally,wedescribeourexperimentalsetup. 6 Wijegunawardana,Ojha,Gera,Soundarajan, Algorithm1Learningbasedmonitorplacement procedureLEARNING(start,budget) G←Graph G.add(start),G.add(N(start)) (cid:46)Startingnodeandneighbors whilebudget>0do Monitors←listofmonitorednodesinG TrainingData←featurevectorsforMonitors TrainclassifierusingTrainingData NotMonitors←listofnotyetmonitorednodesinG forv∈NotMonitorsdo Getfeaturevectorforv P(v=R)←predictv’sprobabilityofRedusinglearningmodel ChoosenodevwithmaximumP(v=R)fromNotMonitors budget←(budget−1) Usevasnextmonitor 5.1 Datasets Noordin Top Network: The first network studied is Noordin Top, a relatively small, but real network with 139 nodes and 1042 edges depicting several types of relation- shipsbetweenthem(‘NoordinTop’isthenameoftheleaderofthisnetwork).2 Inthis network,everynodeisaterrorist,andweareinterestedinidentifyingthoseindividu- alsthatcommunicateusingacertainmedium.Weconsider5versionsofthisnetwork. InNoordinComs1,thePOI(rednodes)arethosecommunicatingusingageneralcom- putermedium;inNoordinComs2,therednodesarethosewhocommunicateusingprint media;inNoordinComs3,therednodesarethosewhocommunicateusingsupportma- terials; in NoordinComs4, the red nodes are those who communicate using unknown media;inNoordinComs5,therednodesarethosewhocommunicateusingvideo(Ta- ble2). Table2:NoordinTopnetworkoverview Networkname RedNodeCount DegreesofRedNodes NoordinComs1 9 8,12,20,21,33,38,50 NoordinComs2 5 8,21,38,38,50 NoordinComs3 9 11,12,21,38,39,50,52 NoordinComs4 18 17,21,24,27,31,38,40,45,52,53,58 NoordinComs5 11 0,9,21,33,38,41,50,52 PokeCNetwork:ThePokeCnetworkispartofaSlovenianon-linesocialnetwork.3 Thenodesinthenetworkareusersofthesocialnetworkandedgesdepictsfriendship relations. Each node has some number of associated user attributes (e.g., age, region, 2Obtained from https://sites.google.com/site/sfeverton18/ research/appendix-1. 3Obtainedfromhttp://snap.stanford.edu/data/. SamplingaNetworktoFindNodesofInterest 7 gender,interests,heightetc.).Weuseasampleofthisnetworkcontainingallnodesin the region ”kosicky kraj, michalovce” and edges among them. This sampled network contains26,220nodesand241,600edges. Weassignnodecolorsbasedontwodifferentnodeattributes:age(anodewithage intherange28-32ismarkedred,andblueotherwise,giving1736rednodes)andheight (auserofheightlessthan160cmismarkedred,giving1668rednodes). 5.2 EliminatingHomophily Inbothnetworksthatweconsider,rednodestendtobeconnectedtoeachother.How- ever,inadarknetworkwhererednodesareactivelytryingtohidetheirpresence,these nodeswouldconcealtheexistenceofsuchconnections(forexample,insteadofusing theirnormalcellphonetomakecallstootherrednodes,arednodemightuseaburner phoneforsuchcalls).Toaccountforthis,weconsiderversionsofourdatasetswhere allconnectionsbetweenrednodesareremoved.Notethatthistypeofnetworkpresents a much more challenging setting, as one cannot simply rely on homophily to find red nodes. 5.3 LyingScenarios Recall that a monitor tells us the true color of the monitored node and the possibly incorrectcolorsofthatnode’sneighbors.Becausewedonothavedatadescribinghow terroristslietopreventdetectionofotherterrorists,wemustsimulatethisbehavior. Informulatingtheselyingscenarios,weassumetheexistenceofahierarchyamong thenodes,wherenodesaremorelikelytolietoprotectthoseabovetheminthehierar- chy.Weassumethattherednodesarefullyawareofthehierarchy,butbluenodesmay ormaynotbeaware,dependingonthescenario. Inboth lyingscenarios, weassume thatnodesmay lienot onlyabout thecolorof rednodes(i.e.,lietoprotectPOIs),butalsoaboutthecolorofbluenodes(i.e.,framing innocentindividualsasadistraction).Withoutthisassumption,theproblemwouldbe trivial,becauseanytimeanyindividualsaidanodewasred,wewouldknowwithfull certaintythatthatnodeisactuallyred. Consider nodes u and v, where u,v∈Edges. The probability that u lies about v, P(uliev)dependson: – Thecolorofu(C )andcolorofv(C ). u v – The inherent honesty of u (H ), where higher H values indicate that u is more u predisposedtotellingthetruth. – Thehierarchicalpositionofu(L )relativetothepositionofv(L ). u v Suppose u is a red node. In all lying scenarios, the probability that u lies about anothernodevisgivenbythefollowingequations: Equation2determinestheprobabilityuwilllieaboutarednode. Lv indicateshow Lu farabovevisinthehierarchycomparedtouand1−H isprobabilitythatuwilllie. u L v P(uliev|v=Red)=min{(1−H )∗ ,1} (2) u L u 8 Wijegunawardana,Ojha,Gera,Soundarajan, Equation3definestheprobabilityuwilllieaboutabluenode.Thisdependsonu’s honestyandiscalculatedas(1−H ). u P(uliev|v=Blue)=(1−H ) (3) u Now suppose that u is a blue node. u may know nothing about red nodes. This depends largely on whether the blue nodes are part of the same organization as the red nodes, but are simply not of interest to the user (e.g., blue and red nodes are all terrorists, and red nodes represent a subset of interest), or if the blue nodes represent individuals who are not part of the same organization as the red nodes (e.g., the red nodesareterroristsinaseaofbluenodecivilians). Table3:NoordinTopnetworkhierarchyassignment Role Hierarchyscore No.ofnodes Strategist 5 10 Commander;ReligiousLeader 4 23 Trainer/instructor;Bombmaker;Facilitator;Propagandist;Recruiter 3 33 Bomber/fighter;SuicideBomber;Courier;Recon/Surveillance 2 33 Unknown 1 40 – Lying scenario-1 (LS1): Blue nodes know about red nodes. Here, P(uliev|u= Blue,v=Red) is determined using equation 2, since blue nodes know about red nodesandtheirhierarchy.Additionally,bluenodesmaylieaboutotherbluenodes. P(uliev|u=Blue,v=Blue)isthuscalculatedusingequation 3. – Lyingscenario-2(LS2):Bluenodesdon’tknowaboutrednodes.Here,bluenodes will simply say that all their neighbors are blue. Because of this P(u lie v|u = Blue,v=Blue)=0andP(uliev|u=Blue,v=Red)=1 Inallcases,ifP(uliev)isgreaterthan1,itisroundeddownto1. 5.4 Experiments Sinceourlyingscenariosareprobabilistic,thecolorsthatnodessayaboutneighborscan changefromonerunofthealgorithmtoanother.Additionally,thehonestyassignment to a node also can change from one run to another. Thus, for each network and lying scenario,weperform25runsofeachmonitorplacementstrategy. In each run, we begin with a randomly selected red node. For a fair comparison acrossdifferentmonitorplacementstrategies,wemakesurethatweruneachmonitor placementstrategywiththesamesetsofstartingnodesandhonestyassignments. In these experiments, the honesty of a node is drawn from a normal distribution, h ∼ N (0.5,0.125). In the Noordin Top network, ground truth hierarchy scores are known,andshowninTable3.InthePokeCnetwork,wesetthehierarchyscoretobe thedegreeofthenode.Givenaparticularlyingscenario,amonitorednodeuliesabout aneighborv’scolorwithprobabilityP(uliev)asgiveninSection5.3. In each run, we consider budgets up to half the number of nodes in the network. TheNoordinTopnetworkissmall,andsoweretrainREDLEARNaftereachmonitoris placed.ThePokeCnetworkislarger,soforthesakeofefficiency,wetrainthelearning modelonceperevery20monitorsplaced. SamplingaNetworktoFindNodesofInterest 9 6 ResultsandAnalysis Inthissection,wecomparetheresultsbasedonthevariousmonitorplacementstrate- gies, and examine how performance is affected by (1) the presence or absence of ho- mophily,(2)thelyingscenariousedbythenodes,and(3)themonitorplacementbudget. As an example, Figure 1 shows results on the NoordinComs4 network with and withoutedgesbetweenrednodes,respectively.Whenthereishomophily,theproblem becomeseasy,andthesimplestrategyofmonitoringthenodewiththemostredneigh- bors (MRN) is best. However, note that in both lying scenarios, REDLEARN is close behind the MRN strategy (because it needs time to train, it doesn’t quite match the performanceofMRN). Homophilyispresent(originalnetwork) d d un15 un15 o o es f es f d d o o n n d 10 d 10 e e R R of of er er b5 b5 m m u u N N 0 20 40 60 0 20 40 60 Number of Monitors placed Number of Monitors placed NoHomophily(edgesbetweenrednodesareremoved) d d un15 un15 o o es f es f d d o o n n d 10 d 10 e e R R of of er er b5 b5 m m u u N N 0 20 40 60 0 20 40 60 Number of Monitors placed Number of Monitors placed (a)LS1:Allnodesawareofrednodes. (b)LS2:Onlyrednodesawareofrednodes. Fig.1:ComparisonofmonitorplacementstrategiesontheNoordinComs4network.The blacklineindicatesthetotalnumberofrednodespresentinthenetwork. However, we see from the bottom figures that when edges between red nodes are removed,theMRNstrategyperformsverypoorly.Inthissetting,REDLEARNperforms muchbetterthanallcomparisonmethods:itisabletolearnthepatternsandstructural 10 Wijegunawardana,Ojha,Gera,Soundarajan, characteristics of red nodes, and by incorporating what neighbors say about a node, achievesstrongperformance. Duetospaceconstraints,wesummarizeresultsfortheothercasesinTables4and5. Weseesimilarpatternsacrossallnetworks:whenthereareedgesbetweenrednodes, it is enough to select the node with the most red neighbors; but when these edges are concealed,REDLEARNistheclearwinner. Even when there are edges between red nodes, REDLEARN usually achieves per- formance close to the MRN strategy. There are some exceptions, such as the No- ordinComs2network.Thistypicallyoccursifthereareveryfewrednodes:forinstance, NoordinComs2,only5outof139nodesarered.Thereissimplynotenoughinforma- tionforREDLEARNtotrainon. Thisanalysisshowsthatperformanceoftheproposedalgorithmdoesnotrelyona particularnetworkstructure,orwhichlyingscenariopeoplemightuse.Thereforeusing learningbasedmonitorplacementisguaranteedtogivebetterperformance,especially inrealnetworks,whicharelarge. Table4:Comparisonofthepercentageofrednodesfoundfromeachmonitorplacement strategy. Budgets include Low (10% of the nodes), Medium (25% of the nodes), and High(50%ofthenodes).Thesenetworksexhibithomophily:edgesbetweenrednodes havenotbeenremoved. (a)LyingScenario1 LowBudget MediumBudget HighBudget Network/ RS RDLRNMRNMRSRSR RS RDLRNMRNMRSRSR RS RDLRNMRNMRSRSR Strategy NrdnComs1 28 74 97 52 32 43 97 100 77 51 97 100 100 92 75 NrdnComs2 42 37 62 48 42 61 72 100 72 55 99 93 100 91 81 NrdnComs3 33 63 83 52 27 59 89 100 77 46 100 100 100 97 73 NrdnComs4 54 60 70 66 28 63 98 100 90 48 100 100 100 100 76 NrdnComs5 34 67 84 52 32 43 91 91 75 46 88 91 91 86 67 PokeCage 5 14 22 20 7 15 43 47 39 21 48 73 68 62 47 PokeCheight 14 14 21 23 11 36 32 48 47 28 74 64 73 69 54 (b)LyingScenario2 LowBudget MediumBudget HighBudget Network/ RS RDLRNMRNMRSRSR RS RDLRNMRNMRSRSR RS RDLRNMRNMRSRSR Strategy NrdnComs1 57 78 89 57 33 82 100 100 79 47 96 100 100 95 73 NrdnComs2 56 54 83 55 38 83 66 99 82 52 94 89 100 94 74 NrdnComs3 50 70 76 52 34 77 85 100 80 50 99 97 100 99 77 NrdnComs4 68 62 74 67 28 92 100 100 91 50 98 100 100 97 79 NrdnComs5 59 64 88 59 32 79 91 91 79 50 89 91 91 90 74 PokeCage 20 12 22 19 7 39 33 47 39 21 62 60 68 62 46 PokeCheight 23 12 21 23 12 46 29 48 46 28 69 62 73 69 54