1 On Quality of Monitoring for Multi-channel Wireless Infrastructure Networks Huy Nguyen, Student Member, IEEE, Gabriel Scalosub and Rong Zheng, Senior Member, IEEE Abstract—Passivemonitoringutilizingdistributedwirelesssniffersisaneffectivetechniquetomonitoractivitiesinwirelessinfrastruc- turenetworksforfaultdiagnosis,resourcemanagementandcriticalpathanalysis.Inthispaper,weintroduceaqualityofmonitoring (QoM)metricdefinedbytheexpectednumberofactiveusersmonitored,andinvestigatetheproblemofmaximizingQoMbyjudiciously assigning sniffers to channels based on the knowledge of user activities in a multi-channel wireless network. Two types of capture modelsareconsidered.Theuser-centric modelassumesframe-levelcapturingcapabilityofsnifferssuchthattheactivitiesofdifferent userscanbedistinguishedwhilethesniffer-centric modelonlyutilizesthebinarychannelinformation(activeornot)atasniffer.For theuser-centricmodel,weshowthattheimpliedoptimizationproblemisNP-hard,butaconstantapproximationratiocanbeattained 3 viapolynomialcomplexityalgorithms.Forthesniffer-centricmodel,wedevisestochasticinferenceschemestotransformtheproblem 1 intotheuser-centricdomain,whereweareabletoapplyourpolynomialapproximationalgorithms.Theeffectivenessofourproposed 0 schemesandalgorithmsisfurtherevaluatedusingbothsyntheticdataaswellasreal-worldtracesfromanoperationalWLAN. 2 IndexTerms—Wirelessnetwork,mobilecomputing,approximationalgorithm,binaryindependentcomponentanalysis. n a (cid:70) J 1 3 ] 1 INTRODUCTION ment wire side monitoring using SNMP and basestation I N logs since it reveals detailed PHY (e.g., signal strength, Deployment and management of wireless infrastructure spectrum density) and MAC behaviors (e.g, collision, s. networks (WiFi, WiMax, wireless mesh networks) are retransmissions), as well as timing information (e.g., c often hampered by the poor visibility of PHY and [ backoff time), which are often essential for wireless MACcharacteristics,andcomplexinteractionsatvarious diagnosis. The architecture of a canonical monitoring 1 layers of the protocol stacks both within a managed systemconsistsofthreecomponents:1)snifferhardware, v network and across multiple administrative domains. 6 2) sniffer coordination and data collection, and 3) data In addition, today’s wireless usage spans a diverse set 9 processing and mining. of QoS requirements from best-effort data services, to 4 Depending on the type of networks being monitored 7 VOIP and streaming applications. The task of managing and hardware capability, sniffers may have access to . the wireless infrastructure is made more difficult due 1 different levels of information. For instance, spectrum 0 to the additional constraints posed by QoS sensitive analyzers can provide detailed time- and frequency- 3 services. Monitoring the detailed characteristics of an domain information. However, due to the limit of band- 1 operational wireless network is critical to many system width or lack of hardware/software support, it may not : v administrative tasks including, fault diagnosis, resource be able to decode the captured signal to obtain frame i management,andcriticalpathanalysisforinfrastructure X level information on the fly. Commercial-off-the-shelf upgrades. networkinterfacessuchasWiFicardsontheotherhand, r a Passive monitoring is a technique where a dedicated can only provide frame level information1. The volume set of hardware devices called sniffers, or monitors, are of raw traces in both cases tends to be quite large. For used to monitor activities in wireless networks. These example, in the study of the UH campus WLAN, 4 devices capture transmissions of wireless devices or million MAC frames have been collected per sniffer per activities of interference sources in their vicinity and channel over an 80-minute period resulting in a total of store the information in trace files, which can be an- 8milliondistinctframesfromfoursniffers.Furthermore, alyzed distributively or at a central location. Wireless duetothepropagationcharacteristicsofwirelesssignals, monitoring[1],[2],[3],[4],[5]hasbeenshowntocomple- a single sniffer can only observe activities within its vicinity. Observations of sniffers within close proxim- H. Nguyen is with the Department of Computer Science, University of ity over the same frequency band tend to be highly Houston,Houston,TX77204(e-mail:[email protected]). G. Scalosub is with the Department of Communications System Engi- correlated. Therefore, two pertinent issues need to be neering, Ben-Gurion University of the Negev, Beer-Sheva, Israel (e-mail: addressed in the design of passive monitoring systems: [email protected]). 1)whattomonitor,and2)howtocoordinatethesniffers R. Zheng is with the Department of Computing and Software, McMaster University, Ontario, Canada (e-mail: [email protected]). This work was conductedwhentheauthorwaswiththeUniversityofHouston. 1.Certain chip sets and device drivers allow inclusion of header AnearlierversionofthisworkappearedintheProceedingsofThe11thACM fields to store a few physical layer parameters in the MAC frames. International Symposium on Mobile Ad Hoc Networking and Computing However, such implementations are generally vendor and driver de- (Mobihoc),pp.111–120,2010. pendent. 2 to maximize the amount of captured information. withtheparametersproperlychosen.Thus,effectiveand This paper assumes a generic architecture of pas- efficientalgorithmsforthesnifferassignmentproblemis sive monitoring systems for wireless infrastructure net- critical. In this paper, we focus on designing algorithms works, which operate over a set of contiguous or non- that aim at maximizing the QoM metric with different contiguouschannelsorbands2.Toaddressthefirstques- granularities of a priori knowledge. The usage patterns tion, we consider two categories of capturing models areassumedtobestationaryduringthedecisionperiod. differed by their information capturing capability. The OurContribution: Inthispaper,wemakethefollowing firstcategory,calledtheuser-centricmodel,assumesavail- contributions toward the design of passive monitoring ability of frame-level information such that activities of systems for multi-channel wireless infrastructure net- differentuserscanbedistinguished.Thesecondcategory works is the sniffer-centric model which only assumes binary information regarding channel activities, i.e., whether • We provide a formal model for evaluating the qual- some user is active in a specific channel near a sniffer. ity of monitoring. Clearly, the latter imposes minimum hardware require- • We study two categories of monitoring models that differintheinformationcapturingcapabilityofpas- ments, and incurs minimum cost for transferring and sive monitoring systems. For each of these models storing traces. In some cases, due to hardware con- we provide algorithms and methods that optimize straints (e.g., in wide-band cognitive radio networks) or the quality of monitoring. security/privacy considerations, decoding of frames to extractuserlevelinformationisinfeasibleandthusonly • Weunravelinteractionsbetweenthetwomonitoring models by devising two methods to convert the binarysnifferinformationmightbeavailableforsurveil- sniffer-centric model to the user-centric domain by lance purpose. We further characterize theoretically the exploiting the stochastic properties of underlying relationship between the two models. user processes. Ideally, a network administrator would want to per- form network monitoring on all channels simultane- More specifically, we show that in both the user- and ously. However, multi-radio sniffers are known to be sniffer-centric models considered, a pure strategy where large and expensive to deploy [6]. We therefore assume a sniffer is assigned to a single channel suffices in sniffers in our system are low-cost devices which can order to maximize the QoM. In the user-centric model, only observe one single wireless channel at a time. we show that our problem can be formulated as a To maximize the amount of captured information, we coveringproblem.TheproblemisproventobeNP-hard, introduce a quality-of-monitoring (QoM) metric defined and constant-approximation polynomial algorithms are as the total expected number of active users detected, provided. With the sniffer-centric model, we show that where a user is said to be active at time t, if it transmits although the only information retrieved by the sniffers over one of the wireless channels. The basic problem is binary (in terms of channel activity), the “structure” underlying all of our models can be cast as finding an of the underlying processes is retained and can be assignment of sniffers to channels so as to maximize the recovered. Two different approaches are proposed that QoM. QoM is an important metric that quantifies the utilize the notion of Independent Component Analysis efficiency of monitoring solutions to systems where it is (ICA) [12] and allow mapping the sniffer assignment important to capture as comprehensive information as problem to the user-centric model. The first approach, possible (e.g.: intrusion/anomaly detection [7], [8] and Quantized Linear ICA (QLICA), estimates the hidden diagnosing systems [9], [10]). structure by applying a quantization process on the out- We note that the problem of sniffer assignment, in an come of the traditional ICA, while the second approach, attempt to maximize the QoM metric, is further compli- Binary ICA (BICA) [13], decomposes the observation catedbythedynamicsofreal-lifesystemssuchas:1)the data into OR mixtures of hidden components and re- user population changes over time (churn), 2) activities covers the underlying structure. Finally, an extensive of a single user is dynamic, and 3) connectivity between evaluationstudyiscarriedoutusingbothsyntheticdata users and sniffers may vary due to changes in channel as well as real-world traces from an operational WLAN. conditionsormobility.Thesepracticalconsiderationsre- The paper is organized as follows. An overview of veal the fundamental intertwining of “learning”, where related work is provided in Section 2. In Section 3, we theusagepatternofwirelessresourcesistobeestimated formally introduce the QoM metric and the user-centric online based on captured information, and “decision and sniffer-centric models for a passive monitoring sys- making”, where sniffer assignments are made based on tem. The NP-hardness and polynomial-time algorithms available knowledge of the usage pattern. In fact, in forthemaximumeffortcoverageproblemthatunderlies our earlier work [11], we prove that during learning, two variants of the user-centric model are discussed each instance of the decision making is equivalent to in Section 4. The relationship between the user-centric solving an instance of the sniffer assignment problem and sniffer-centric models is established in Section 5, wherewealsodescribetwoschemesforsolvingtheQoM problem under the sniffer-centric model. We present the 2.A channel can be a single frequency band, a code in CDMA systems,orahoppingsequenceinfrequencyhoppingsystems. results of the evaluation study using both synthetic and 3 real traces in Section 6. We discuss issues regarding model results in a problem formulation that is similar practical system implementation in Section 7 and finally to (albeit different from) the one addressed in [19]. conclude the paper in Section 8. On one hand, we assume all sniffers may be used for monitoring (hence parting with our problem being akin to the classical maximum-coverage problem, while on 2 RELATED WORK the other hand we focus on the weighted version of the problem, where elements to be covered have weights. In this section, we provide an overview of related work One should note that all the lower bounds mentioned pertaining to wireless network monitoring, and binary in [20], [19] do not apply to our problem. independent component analysis. Binary independent component analysis: Binary ICA Wireless monitoring: There has been much work done is a special variant of the traditional ICA, where linear on wireless monitoring from a system-level approach, in mixing of continuous signals is assumed. In binary ICA, an attempt to design complete systems, and address boolean mixing (e.g., OR, XOR etc.) of binary signals theinteractionsamongthecomponentsofsuchsystems. is considered. Existing solutions to binary ICA mainly The work in [14], [15] uses AP, SNMP logs, and wired differ in their assumptions of prior distribution of the sidetracestoanalyzeWiFitrafficcharacteristics.Passive mixing matrix, noise model, and/or hidden causes. In monitoring using multiple sniffers was first introduced [21], Yeredor considers binary ICA in XOR mixtures by Yeo et al. in [1], [2], where the authors articulate the and investigates the identifiability problem. A deflation advantages and challenges posed by passive measure- algorithmisproposedforsourceseparationbasedonen- ment techniques, and discuss a system for performing tropy minimization. In [21] the number of independent wireless monitoring with the help of multiple sniffers, random sources K is assumed to be known. Further- which is based on synchronization and merging of the more, the mixing matrix is a K-by-K invertible matrix. traces via broadcast beacon messages. The results ob- In [22], an infinite number of hidden causes following tainedforthesesystemsaremostlyexperimental.Rodrig the same Bernoulli distribution is assumed. Reversible et al. in [3] used sniffers to capture wireless data, and jump Markov chain Monte Carlo and Gibbs sampler analyzetheperformancecharacteristicsofan802.11WiFi techniques are applied. In contrast, in our model, the network.Onekeycontributionwastheintroductionofa hiddencausesmayfollowdifferentdistributions.Streith finite state machine to infer missing frames. The Jigsaw etal.[23]studytheproblemofmulti-assignmentcluster- system, that was proposed in [4], focuses on large scale ing for boolean data, where an object is represented by monitoring using over 150 sniffers. a boolean attribute vector. The key assumption made in A number of recent works focused on the diagnosis of this work is that elements of the observation matrix are wirelessnetworkstodeterminecausesoferrors.In[16],Chan- conditionally independent given the model parameters. dra et al. proposed WiFiProfiler, a diagnostic tool that This greatly reduces the computational complexity and utilizes exchange of information among wireless hosts makes the scheme amenable to gradient descent opti- about their network settings, and the health of network mizationsolution;however,theassumptionisingeneral connectivity. Such shared information allows inference invalid. In [24], the problem of factorization and de- of the root causes of connectivity problems. Building noising of binary data due to independent continuous on their monitoring infrastructure, Jigsaw, Cheng et al. sources is considered. The sources are assumed to be [17] developed a set of techniques for automatic char- followingabetadistributionandnotbinary.Finally,[22] acterization of outages and service degradation. They considerstheunder-representedcaseoflesssensorsthan showedhowsourcesofdelayatmultiplelayers(physical sourceswithcontinuousnoise,while[24],[23]dealwith throughtransport)canbereconstructedbyusingacom- the over-determined case, where the number of sensors bination of measurements, inference and modeling. Qiu is much larger than the number of sources. et al. in [18] proposed a simulation based approach to determine sources of faults in wireless mesh networks caused by packet dropping, link congestion, external 3 PROBLEM FORMULATION noise, and MAC misbehavior. 3.1 Notationandnetworkmodel All the afore-mentioned work focuses on building monitoring infrastructure, and developing diagnosis Consider a system of m sniffers, and n users, where techniques for wireless networks. The question of opti- each user u operates in one of K channels, c(u) ∈ K = mally allocating monitoring resources to maximize cap- {1,...,K}. The users can be wireless (mesh) routers, tured information remains largely untouched. In [19], access points or mobile users. At any point in time, a Shin and Bagchi consider the selection of monitoring sniffercanonlymonitorpackettransmissionsoverasin- nodesandtheirassociatedchannelsformonitoringwire- gle channel. We assume the propagation characteristics less mesh networks. The optimal monitoring is formu- of all channels are similar. We represent the relationship latedasmaximumcoverageproblemwithgroupbudget betweenusersandsniffersusinganundirectedbi-partite constraints (denoted MC-GBC), which was previously graph G = (S,U,E), where S is the set of sniffer nodes studied by Chekuri and Kumar in [20]. The user-centric and U is the set of users. Note that G represents a 4 general relationship between the users and sniffers, and TABLE 1: Notations no propagation or coverage model is assumed. An edge m,n, number of sniffers, users, e = (s,u) ∈ E exists between sniffer s ∈ S and user K,T channels, and observations u ∈ U if s can capture the transmission from u, or bi-partite graph representing G equivalently, u is within the monitoring range of s. If user and sniffer adjacency transmissions from a user cannot be captured by any S set of sniffers nodes in G sniffer, the user is excluded from G. For every vertex U set of users in G v ∈ U ∪S, we let N(v) denote vertex v’s neighbors in A set of all possible sniffer-channel assignments G. For users, their neighbors are sniffers, and vice versa. xm×1 vector of mfrobminamrysrnainffdeorsm variables We will also refer to G as the binary m×n adjacency y vector of n binary random variables matrix of graph G. n×1 from n users Wewillconsidersnifferassignmentsofsnifferstochan- Xm×T collection of T observations of x nels, a : S → K. Given a sniffer assignment a, we Yn×T collection of T observations of y considerapartitioningofthesetofsniffersS =(cid:83)K S , Gm×n binary adjacency matrix of G k=1 k p active probability vector of n users 1×n where Sk is the set of sniffers assigned to channel k. We c(u) active channel of user u furtherconsiderthecorrespondingpartitionofthesetof A(u) sniffer assignments that can monitor user u usersU =(cid:83)K U ,whereU isthesetofusersoperating π(a) probability distribution of assignment a k=1 k k in channel k. Let G = (S ,U ,E ) denote the bipartite k k k k subgraph of G induced by channel k. Given any sniffer s, we let Nk(s) = N(s)∩Uk, i.e., the set of neighboring u1 users of s that use channel k. s s u A monitoring strategy determines the channel(s) a snif- 1 2 2 fer monitors. It could be a pure strategy, i.e., the channel a sniffer is assigned to is fixed, or a mixed strategy where sniffers choose their assigned channel in each (cid:20) (cid:21) 1 0 slot according to a certain distribution. Formally, let G= 1 1 A = {a|a:S →K} be the set of all possible assign- ments. Let π : A → [0,1] be a probability distribution Fig. 1: A toy example. Users are shown in white circles and over the set of sniffer assignments. We refer to such a sniffers are shown in black circles. Sniffer range indicates distribution as a mixed strategy. A pure strategy that weather or not a sniffer can capture a user’s transmissions. selects a single channel per sniffer is a special case of mixed strategies, namely, π(a) = 1. It follows that the In the user-centric model, the transmission probabili- pure strategy is generally suboptimal comparing to the tiesoftheusersp={p |u∈U}areknownandassumed u mixed strategy. However, as shown in the next section, to be independent 3. p denotes the transmission prob- u the optimal solution can be obtained using just a pure ability of user u. p and G can be estimated by putting u strategy. all sniffers in the same channel and iterating through all In this paper we consider the problem of finding the possible channels for sufficiently long time. Each user monitoring strategy that maximizes QoM, defined as process may be IID or non-IID over time. the expected number of users detected given the sniffer Consider a wireless network with 2 sniffers and 2 assignments. The main notations used in this paper are users on 2 channels (Figure 1). User u and u are 1 2 summarized in Table 1. active on channels 1 and 2, respectively. Transmission probabilities of users are p = 0.2 and p = 0.5. User- 1 2 3.2 ModelsforObservingUserAccessPatterns centric model assumes G and p={p ,p } are available. 1 2 In this section, two categories of parametric models are Note that the maximum value of QoM in the above proposedtodescribetheobservabilityofusagepatterns. network is 0.7 attained when s and s are assigned to 1 2 We assume time is separated into slots, where each slot channels 1 and 2, respectively. represents a fixed duration of time. A user is active if Sniffer-centric model: The user-centric model requires there exists a transmission event from the user during detailed knowledge of each user’s activities. This neces- the slot time. In the experiments, slot time is chosen to sitates frame-level capturing capability by the passive be on the same order of maximum packet transmission monitoring system. In the sniffer-centric model, only time. Furthermore, we assume all channel and users’ binary information (on or off) of the channel activity at statistics remain stationary for the monitoring period of each sniffer is observed. T time slots. We denote by x the binary vector of observations k User-centric model: First, we consider transmission events in the network from the user’s viewpoint. We 3. The assumption that user activities are independent has been widelyadoptedinliterature,examplesare[25]and[26].Inthesimu- assumethatGisknownbyinspectingthepacketheader lation evaluation in Section 6, the proposed algorithms are shown to information from each sniffer’s captured traces. performwellevenwhensuchindependencyisviolated. 5 when all sniffers operate on channel k and by X 4.1 MAX-EFFORT-COVERAGEproblem k the collection of T realizations of x . We assume that k Under the user-centric model, the objective to find the sniffers observations on different channels are indepen- sniffer-channel assignment that can monitor the largest dent. However, dependency exists among observations (weighted) set of users, subject to the constraint that of sniffers operating in the same channel (as a result of each sniffer can only monitor one of the K channels transmissions made by the same set of users). Given an at a time. We henceforth refer to the problem as MAX- assignmenta,acompletecharacterizationofthesniffers’ EFFORT-COVERAGE(MEC)problem.NotethatinMEC observationsisgivenbythejointprobabilitydistribution the weights can in fact be any non-negative values and Pa(xk), k = 1,...,K. Here, Pa(xk) is implicitly depen- arenotlimitedto[0,1].TheMECproblemcanbecastas dent on the assignment a such that if sniffer i is not the following integer program (IP): assignedtothek’thchannel,itsbinaryobservationx (i) k (cid:80) isalwayszero.Byindependenceofdifferentchannelswe max p y have Pa(x)=(cid:81)Kk=1Pa(xk). sz.t. (cid:80)uK∈Uzu u≤1 ∀s∈S Consider again the network in Figure 1. Over T time y k≤=1(cid:80)s,k z ∀u∈U (4) slots, we have two observation matrices X and X u s∈N(u) s,c(u) 1 2 y ≤1 ∀u∈U at the same dimension (2 × T) corresponding to the u y ,z ∈{0,1} ∀u,s,k. activities on two channels. The first and second line in u s,k each matrix contain observations from sniffers s1 and Each sniffer is associated with a set of binary decision s2, respectively. Sniffer-centric model assumes only the variables, zs,k =1 if the sniffer is assigned to channel k; availabilityofX1 andX2,whileGandpareunknown. 0,otherwise.yuisabinaryvariableindicatingwhetheror Clearly, the sniffer-centric model is not as expressive not user u is monitored, and p is the weight associated u as the user-centric model (formally characterized in Sec- with user u. The objective function characterizes the tion 5.1). However, it has the advantage of being based number of (weighted) users that can be monitored with on aggregated statistics, which are likely to remain sta- assignment z. tionaryinthepresenceofmoderateuser-leveldynamics, One should first note that the problem is trivial if such as joining and leaving the networks, or changes K = 1, since all sniffers would simply be assigned to in transmission activities (e.g., busy or thinking time). the sole available channel. We can therefore assume that Furthermore, obtaining such binary information is less K ≥ 2. The MEC problem can be viewed as a special costly in both hardware requirements and communica- case of the MC-GBC (mentioned in Section 2), where tion/storage complexity. all sniffers are used. One should note that previous hardnessresultsforMC-GBC(bothNP-hardness,aswell ashardnessofapproximation)werebasedonareduction 4 QOM UNDER THE USER-CENTRIC MODEL to the standard maximum coverage problem. It follows that none of these proofs are applicable to the MEC Undertheuser-centricmodelthegoalistomaximizethe problem.Surprisingly,therehasnotbeenanyworkdone expected number of active users monitored. Recall that explicitly on the MEC problem, which seems to be a pu isthetransmissionprobabilityofuseru.Thisproblem naturalandimportantvariantofthemaximumcoverage can be formulated formally by: problem. (cid:80) (cid:80) max p π(a) u∈U u a∈A(u) π(a) 4.2 HardnessofMEC s.t. π(a)∈[0,1] (1) (cid:80) In what follows we show that the MEC problem is NP- π(a)=1, a∈A hardforK ≥2,evenfortheunweightedcase(i.e.,where where A(u) is the set of assignments that monitors pu =1 for all u∈U). The hardness of the MEC problem user u, i.e., A(u) = {a|∃s∈N(u) s.t. a(s)=c(u)}. The actually follows from the choices available to the differ- ent sniffers. It is inherently different from the hardness objectivefunctioncalculatestheopportunityforallusers suggestedfortheMC-GBCproblem,whichfollowsfrom tobemonitoredgiventheprobabilityofeachassignment limiting the number of sniffers one is allowed to use. (QoM). It can be written as, We prove hardness of MEC using a reduction from (cid:88) (cid:88) π(a) p ·I , (2) the problem of Monotone-3SAT (MON3SAT), which is u {a∈A(u)} known to be NP-hard (see [27], [28]). In MON3SAT we a∈A u∈U are given as input an instance of 3SAT where every whereI{·} isanindicatorfunction.FromEq.(2)itisclear clause consists of either solely positive variables, or that a pure strategy can be adopted and is optimal, i.e., solelynegatedvariables.Thegoalistodecidewhetheror an optimal assignment is given by notthereexistsanassignmentwhichsatisfiesallclauses. In [29], we proved that the the unweighted MEC (cid:88) a∗ =argmax pu·I{a∈A(u)}. (3) problem is NP-hard, even for K =2. The result implies a u∈U that one would have to settle for approximate solutions 6 to MEC. We first note that Guruswami and Khot show 5 QOM UNDER THE SNIFFER-CENTRIC in[30]thatMON3SATisNP-hardtoapproximatewithin MODEL a factor of 7/8+ε for every ε > 0. The following is a The user-centric model is more expressive than the corollary of the above fact: sniffer-centric model, which assumes the availability of Corollary 1: The MEC problem is NP-hard to approx- thebinaryobservationmatrixX only.However,wewill imate to within a factor of 7/8+ε for every ε>0. show in this section the two models are intrinsically connectedbydevisingalgorithmstoinferGandpfrom X. 4.3 AlgorithmsforMEC Recall that in the sniffer-centric model, given an as- SinceMECisaspecialcaseoftheMC-GBCproblem,we signment a ∈ A, (cid:81)K P (x ) is the probability dis- k=1 a k can use the available approximation algorithms for MC- tribution of binary observations from m sniffers. Let GBC (e.g., [20], [19]) to solve our problem in the user- w(x )bethenumberofactiveuserscapturedbysniffers k centric model. In what follows we give a brief overview in channel k given sniffer observations x . The MEC k of the algorithms we use. problem under the sniffer-centric model is defined as follows. The Greedy algorithm: The Greedy algorithm itera- tively assigns sniffers to users, where at each step it max (cid:80)a∈Aπ(a)(cid:80)Kk=1E[w(xk)] π(a) choosesthesnifferandtheassignmentthat(locally)max- s.t. π(a)∈[0,1] (5) imizestheweightofcoverageofthosenotyetmonitored (cid:80) π(a)=1, users. a∈A It is proven in [20] that in the unweighted case, i.e., The expectation is with respective to Pa(xk). QoM in where all users have the same weight, Greedy guaran- sniffer-centric model can be explained as the expected teestoproducea 1-approximatesolution,andthatthisis number of active users captured on all channels given 2 tight. The following theorem shows that the same holds the assignment probabilities. Clearly, a pure strategy also for the weighted case, which generalizes the MEC suffices, i.e., there exists an optimal assignment such problem. that, K Theorem 2: Greedyisa 12-approximationalgorithmfor a∗ =argmax(cid:88)E[w(x )]. (6) k the weighted MC-GBC problem. a k=1 LP-based algorithm: Thisalgorithmisbasedonsolving Even with pure strategies, the optimization problem the LP-relaxation of the IP formulation for MEC appear- defined in (5) is still challenging to solve directly. The ing in (4). Once we have an optimal solution to the LP- main difficulty arises from the evaluation of E[w(x )]. k relaxation, we round the fractional solution into an integral Givenx ,onecannotdecidehowmanyusersareactive. k solution, with e.g., the probabilistic rounding technique Consider two scenarios. In the first case, two users are of Srinivasan [31]. We next sketch the basic idea of this observedbytwosniffersrespectively.Inthesecondcase, probabilistic rounding technique. Let z∗ be an optimal a single user is observed by both sniffers. From bi- solution to the LP relaxation of (4), and let s be any nary observations alone, one cannot distinguish the two sniffer.If(cid:80) z∗ >0,onecanviewtheinducedsolution cases, which correspond to different number of active k s,c z∗ :C →[0,1]asaprobabilitymeasureoverthedifferent users.Furthermore,incontrasttotheuser-centricmodel, s channels(vianormalization).Thegoalistodecideonan where transmission activities from different users are integral channel assignment for s, namely, setting each independent, observations of sniffers are correlated. As z∗ toavaluein{0,1}suchthatexactlyonevariableout a result, P (x ) cannot be simplified as a product form. s,c a k of the k variables corresponding to sniffer s is set to the This motivates us to exploit the underlying (though not value1.Thealgorithmbuildsabinarytreewhoseleaves directlyobservable)independenceamongusers,andmap correspondstothekvariableszs,k associatedwithsniffer the optimization problem in sniffer-centric model to QoM s, and pairs unset variables in a bottom-up fashion. under the user-centric model. The pairing is made such that an internal node sets at In the sniffer-centric model, each sniffer only reports least one of the variables corresponding to its children. binary output regarding the activities in the channel This is done while adjusting the (probability) value of currently monitored by that sniffer, and thus the access the (other) unset variable. This approach is proven to probability of the users as well as the bipartite graph G, produce a valid assignment in linear time [31]. We refer are both hidden. Recall that G refers to the adjacency to the above algorithm as ProbRand. binary matrix of G. We first derive the sufficient and Theorem 3: ProbRand is a (1−1/e)-approximation al- necessary conditions for unraveling the transmission gorithm for the weighted MC-GBC problem. probabilities of the users given G and P(x). We note that the approximation guarantee of the LP- Let y = {y ,y ,...,y }T be a vector of n binary 1 2 n basedalgorithmsarebestpossiblefortheMC-GBCprob- random variables, where y = 1 if user j transmits in j lem. However, this lower bound does not necessarily its associated channel, and y = 0 otherwise. y is the j k hold for the MEC problem. vector of activities for users transmitting on channel k 7 y 1 3 Proof: 1 2 3 4 5 6 7 8 9 10 2 2 3 ⇒ It is easy to see that the necessary condition holds. 1 4 5 If two users have the same set of sniffer neighbors, unless packet headers are analyzed, their activities 6 9 5 cannot be distinguished. 8 4 10 ⇐ To prove the sufficient condition, we construct a 7 1 2 3 4 5 x procedure to determine p from sniffer’s joint dis- tribution P(x). 1 1 0 1 0 1 0 0 0 0 x1 Case 1: First, we consider a more restrictive con- 0 1 0 1 0 0 0 0 0 0x2 straint, namely, ∀uj (cid:54)= uj(cid:48) ∈ U, N(uj) (cid:54)⊆ N(uj(cid:48)). We G=0 0 1 0 1 0 0 0 0 0x3 letgj denotethej’thcolumnoftheadjacencymatrix 0 0 0 0 0 1 1 1 0 0x4 G, i.e., a binary vector of length m. In other words, 0 0 0 1 1 0 0 1 1 1 x5 gj is the coverage vector of the j’th sniffer. Since y1 y2 y3 y4 y5 y6 y7 y8 y9 y10 ∀uj (cid:54)=uj(cid:48) ∈U,N(uj)(cid:54)⊆N(uj(cid:48)),wehavegj∪gj(cid:48) (cid:54)=gj and gj ∪gj(cid:48) (cid:54)=gj(cid:48). From (8), we have Fig. 2: A sample network scenario with number of sniffers (cid:89) m = 5, number of users n = 10, its bipartite graph transfor- P(x=gj)=pj (1−pj(cid:48)), mation and its matrix representation. White circles represent j(cid:48)∈U,j(cid:48)(cid:54)=j independent users, black circles represent sniffers and dashed (cid:81) lines illustrate sniffers’ coverage range. Since P(x=∅)= j∈U(1−pj) (recallby ourabuse of notation that ∅ is the all-zero vector of length m), we have (Pi.(ey.,)u=se(cid:81)rsyijn=1Upkj)(cid:81). Tyhj=e0j(o1in−tpdji)s.trTihbuetpiornodoufcyt fiosrmgiviesndbuye pj = P(x=Pg(jx)+=Pgj()x=∅). (9) to the independence among users’ activities. The main question we aim to answer is: given the vector x of Case 2: Now we consider the case when the con- k sniffers’ observations, what knowledge can be obtained dition ∀uj (cid:54)= uj(cid:48) ∈ U, N(uj) (cid:54)⊆ N(uj(cid:48)) is violated. regarding y ? Throughout this section, unless otherwise Without loss of generality, assume only one such specified,wkelimitthediscussiontousersandsniffersin pair (j,j(cid:48)) exist and N(uj(cid:48)) ⊆ N(uj) (analysis for a fixed channel k, and drop the subscript. We will also more complicated cases follows the same line of denote by g the entry in the i’th row and j’th column arguments). In this case, pj(cid:48) can be derived as in ij of G. the previous case. However, equation (9) does not Using the adjacency matrix, and using ∧ to represent holdforuserj anymoresinceifx=gj,userj(cid:48) may Boolean AND and ∨ to represent Boolean OR, we have or may not be active. More specifically, the following: (cid:89) P(x=gj)=pj (1−pj(cid:48)(cid:48)). (10) (cid:95)n j(cid:48)(cid:48)∈U,j(cid:48)(cid:48)(cid:54)=j,j(cid:48) x = g ∧y , i=1,...,m, (7) i ij j Therefore, we have j=1 P(x=g )(1−p ) i.e., xi = 1 iff there exists a user j within the range of pj = P(x=g )(1−jp )+Pj(cid:48)(x=∅). (11) sniffer i (g = 1) that transmits (y = 1). Define the j j(cid:48) ij j set Y(x) = {y | (cid:87)nj=1gij ∧yj = xi,∀i}, i.e., the set of Inotherwords,theactiveprobabilityofuserscanbe useractivityprofilesthatareconsistentwiththesniffers’ computedbyconsideringtheuserswiththesmallest observations. Therefore, degree in G first, and then applying (11) iteratively (cid:88) in ascending order of node degree. P(x)=P(y∈Y(x))= P(y) (8) y∈Y(x) The above theorem essentially shows that in the An example network with sniffers and users, the corre- sniffer-centric model, if G is known, then one can ef- sponding bipartite graph, and its matrix representation fectively determine the transmission probabilities of the G are given in Figure 2. users. In presence of measurement noise, methods such as Expectation-Maximization can be applied. We there- 5.1 Relationship between the user-centric and fore obtain an instance of the problem corresponding to sniffer-centricmodelswithknownGandunknownp our user-centric model, which can be solved efficiently using the algorithms described in Section 4.3. The necessary and sufficient conditions that uniquely determine p using G and P(x) is characterized in the Comment: Though Theorem 4 requires the users be following theorem. connected to different sets of sniffers, violation of the Theorem 4: Given G = (S,U,E), p can be uniquely condition would not affect the channel assignment. For determined by P(x) iff ∀uj (cid:54)=uj(cid:48) ∈U, N(uj)(cid:54)=N(uj(cid:48)). example, when two users u and v are connected to the 8 same set of sniffers, and are thus “indistinguishable” Λ is the diagonal scaling matrix with λ =maxstep((cid:96) ), ii i in the binary sniffer observations, we can effectively where (cid:96) is the i’th column of Ł, and i view them as a single user with active probability (cid:26) 1 − (1 − pu)(1 − pv) if users are active independently, maxstep(r)= max(r) if |max(r)|>|min(r)| or p +p if only one user can be active at a time (e.g., min(r) otherwise. u v (14) due to CSMA). Λ scales the elements in the mixing matrix to the max- imum value 1. The matrix T contains thresholds, such 5.2 InferenceofunknownGandpusingbinaryICA that the higher the threshold value, the sparser Gˆ is. In this section, we derive methods to estimate the un- knownmixingmatrixGandtheactiveprobabilityvector Estimation of P(y): Once Gˆ is determined, P(y) needs p.ConsideragaintheexampleinFigure1.Letsnifferss1 to be estimated. From xi = U(gˆiyi), where gˆi is the i’th and s be assigned to channel 1 and observe the activity row of Gˆ (i.e., the estimated coverage vector of sniffer 2 of a single user y1. In this case, x1 =x2. Therefore, si), we have, P(x) = P(x )I (cid:89) 1 {x1=x2} p(xi =0)= p(yj =0). (15) = P(y =x ) 1 1 (cid:26) p , x =1 gˆij=1 = 1 1 . 1−p1, x1 =0 The product is due to the independence of yi’s. Taking Therefore, if the joint distribution of x is the product log(·) on both sides, we have k of a marginal distribution with an indicator function, (cid:88) log(p(x =0))= log(p(y =0)). (16) and the two marginal distributions are identical, we can i j infer that both sniffers observe the same set of users. gˆij=1 Generally, the joint distribution of x preserves a certain Let α = log(p(x = 0)), and β = log(p(y = 0)). Define stochastic “structure” of the user’s activities. We will i i i j α = {α ,α ,...,α }T, and β = {β ,β ,...,β }T. We formalize this observation in the subsequent section by 1 2 m 1 2 n can calculate p(y = 0) (and consequently obtain P(y)) devising two inference methods to estimate G and p j by solving the following optimization problem. from P(x). min (cid:107)α−Gˆβ (cid:107)2 5.2.1 QuantizedLinearICA(QLICA) β (17) First we will estimate G by applying the classic ICA s.t. β <0, on the binary data followed by a quantization process. ThenP(y)canthenbecalculatedbysolvingaquadratic where (cid:107)·(cid:107) is the second norm of a vector. The objective function minimizes the distance between the real obser- programming problem. vation vector x and its reconstructed counterpart (Gˆβ). Estimation of G: The problem is similar to what Clearly, this is a constrained quadratic programming wasaddressedbytheIndependentComponentAnalysis problem with a positive semi-definite matrix (i.e., all (ICA)scheme[12],wheretheobserveddataisexpressed eigenvalues are non-negative), and can be solved in as a linear transformation of latent variables that are polynomial time. non-Gaussian and mutually independent. Classic ICA assumes that both y and x are continuous random Channelselection: Withtheestimatedpˆ andGˆ athand, variables and that x is the outcome of a linear mixing of we effectively transform the sniffer-centric model to the y, and thus is not directly applicable to our problem. user-centricmodel.MethodsdescribedinSection4.3can We adopt the algorithm presented in [32] with some then be applied to determine the channel assignment of modifications. The basic idea is as follows. eachsniffer.QuantizeICAalgorithmwhichinferspˆ and Wefirstobservethat(7)canbesimplifiedusinglinear Gˆ is presented in Algorithm 1 and the complete QLICA mixing and a (coordinate-wise) unit step function. scheme is illustrated in Figure 3(a). x=U(Gy), (12) Algorithm 1: Quantized linear ICA inference where U(·) is a unit step function defined by U(r) = I . By applying the standard ICA on x, we can QuantizeICA(X) {r>0} input :DatamatrixXm×T “decompose” the observation to x ≈ Łs, with Ł is the init :T =thresholdmatrix; linear mixing matrix and s is the collection of random 1 Ł=mixingmatrixobtainedbyapplyingICAonX; sources. However, both Ł and s are not the solutions 23 ΛGˆ==dUi(aŁgΛon−a1l−scaTli)n;gmatrixcalculatedfromŁ; to our problem since they contain fractional values. 4 CalculateαfromX withαi=log(p(xi=0)); Therefore,wequantizeŁtogettheinferredbinarymixing 5 Obtainpˆ bysolvingthequadraticprogrammingproblemin(17); matrix Gˆ as follow, 6 output:pˆ andGˆ Gˆ =U(ŁΛ−1−T). (13) 9 X L 𝑮 ,𝑿 𝛲(𝒚 ) z s,c Quadratic Linear ICA Quantization MEC Programming (a) Quantized Linear ICA X 𝑮 ,𝛲(𝒚 ) z s,c Binary ICA MEC (b) Binary ICA Fig. 3: Channel selection algorithm under QLICA and BICA models A toy example: We next give a simple example, which Joint estimation of G and P(y): The basic idea of provides insight as to the operations of QLICA. Let BICAalgorithmisasfollow:givenanobservationmatrix us reconsider the network in Figure 1 with u and u X from m sniffers, we will first assume that there exists 1 2 operateononesinglechannel.WithT =10observations, at most 2m distinguishable users. Each user is repre- supposedly we have the activity matrix sented in G by a unique column ∈ {0,1}m indicating (cid:20) (cid:21) its connections to m sniffers. 2m users are ordered by 0 0 1 0 0 0 0 0 1 0 Y = . ascending values of their corresponding columns. We 0 1 1 0 0 1 1 1 0 0 will recursively construct 2 submatrices from X such yij = 1 indicates that user yi is active on the channel that the first submatrix captures the joint distribution at tim(cid:20)e slo(cid:21)t j. Y is hidden and unknown to us. Since of activities from the first 2m−1 users (in product form 1 0 G= , we have the observation matrix due to the independence assumption). If the submatrix 1 1 (cid:20) (cid:21) is small enough, the join distribution can be inferred 0 0 1 0 0 0 0 0 1 0 X = . directly, otherwise, a divide-and-conquer approach is 0 1 1 0 0 1 1 1 1 0 taken. Once the join distribution of the first 2m−1 users Applying the linear ICA, we obtain are available, that of the remaining 2m−1 users can be (cid:20) 0.30 −0.35(cid:21) (cid:20)−2.89 0 (cid:21) inferred from the second submatrix. Ł= −0.20 −0.35 ,Λ−1 = 0 −2.89 . Next, we present in detail the proposed BICA al- gorithm. Let define X0 to be a submatrix of With threshold T =0.5, solving the equation (13) and (h−1)×T X, where the rows correspond to observations of the optimization problem (17), we have the following x ,x ,...,x for t = 1,2,...,T such that x = 0, 1 2 h−1 ht inferred results. i.e., the first submatrix mentioned above. Also, define (cid:20) (cid:21) 0 1 X to be the matrix consisting the first h − 1 Gˆ = ,pˆ ={0.5,0.2}. (h−1)×T 1 1 rows of X, i.e., the second submatrix. Let F(.) be the frequency function of some event, we have the iterative Inferred results Gˆ and pˆ are actually permutations of inference algorithm as illustrated in Algorithm 2. the original mixing matrix G and the active probability When the number of observation variables m = 1, p. We see that QLICA can successfully infer information there are only two possible unique sources, one that can regarding the underlying model from P(x). be detected by the monitor x , denoted by [1]; and one 1 that cannot, denoted by [0]. Their active probabilities can easily be calculated by counting the frequency of 5.2.2 BinaryICA(BICA) (x =1)and(x =0)(lines1–3).Ifm≥2,pandGare 1 1 Instead of applying a quantization process on the result estimatedthrougharecursiveprocess.X0 issam- of linear ICA, we can apply the BICA algorithm pro- pled from columns of X that have x =(0m.−I1f)X×T0 posedin[13]todetermineP(y)andGbyexploitingthe m (m−1)×T is an empty set (which means x = 1,∀t) then we can OR mixture model between y and the observation vari- mt associatex withaconstantlyactivecomponentandset able x. Compared with QLICA, BICA explicitly account m the other components’ probability accordingly (lines 4 – forthegenerativemodelandthusleadstomoreaccurate 7). If X0 is non-empty, we invoke FindBICA on estimationresults.However,thiscomesattheexpenseof (m−1)×T twosub-matricesX0 andX todetermine higher computation complexity. In the worst case, given (m−1)×T (m−1)×T m sniffers, the run time of the algorithm is O(m2m).4 pˆ1...2m−1 and pˆ(cid:48)1...2m−1, then infer pˆ2m−1+1...2m (lines 8 – 12).Finally,pˆ anditscorrespondingcolumngˆ inGˆ are Forcompleteness,wefirstdefinesomenotationandthen h h pruned in the final result if pˆ <ε (lines 13 – 15). outline the BICA algorithms. h Channel selection: Now that pˆ and Gˆ are inferred, we 4.Severaltechniquesaresuggestedin[13]toreducethecomputation complexity. again transform the sniffer-centric model to the user- 10 Algorithm 2: Incremental binary ICA inference an extensive evaluation on performance of QLICA and FindBICA(X) BICA. input :DatamatrixXm×T init :n=2m−1; pˆ =1×nzerovector; 6 SIMULATION VALIDATION Gˆ =m×(2m−1)matrixwithrowscorrespondingallpossible In this section we evaluate the performance of differ- binaryvectorsoflengthm; ε=theminimumthresholdforph tobeconsideredareal ent algorithms under the user-centric and sniffer-centric component; models using both synthetic and real traces. Synthetic 1 ifm=1then traces allow us to control the parameter settings while 2 pˆ1=F(x1=0); real network traces provide insights on the performance 3 pˆ2=F(x1=1); under realistic traffic loads and user distributions. else 4 ifX0(m−1)×T =∅then In addition to the Greedy and LP-based algorithm, 5 pˆ1...2m−1 =FindBICA(X(m−1)×T); we also consider Max Sniffer Channel (Max) where a 6 pˆ2m−1+1=1; sniffer is assigned to its busiest channel. This scheme is 7 pˆ2m−1+2...2m =0; the most intuitive approach in practical networks where else 8 pˆ1...2m−1 =FindBICA(X0(m−1)×T); the user model is not available and sniffers have to de- 9 pˆ(cid:48)1...2m−1 =FindBICA(X(m−1)×T); cide their channel assignment non-cooperatively based on 10 forl=2,...,2m−1 do local observations. Note it is easy to construct scenarios 11 pˆl+2m−1 =1− 11−−ppˆˆ(cid:48)ll; whereMaxperformsarbitrarilybad.Thus,itsworstcase 12 pˆ2m−1+1= (cid:81)Fl(=x1m...=2m1∧−x1i,=l(cid:54)=02,∀mi−∈[11+..1.m(1−−1pˆ])l); ptheerfQorLmICanAcemiosduenl,bowuenduseedd. FthoretFhaestiInCfeAreanlcgeorsicthhemme[1i2n] 13 forh=1,...,2m do to compute the linear mixing matrix Ł. 14 if(pˆh<ε)∨(pˆh=0)then 15 prunepˆh andcorrespondingcolumngˆh; 6.1 QoMunderdifferentmodels 16 output:pˆ andGˆ 6.1.1 Synthetictraces centric model and apply methods Section 4.3 to find op- timal sniffer-channel assignment scenario. The complete BICA scheme is illustrated in Figure 3(b). A toy example: Again, let us reconsider the network in Figure 1. Recall that p = {0.2,0.5}, T = 10 and the observation matrix is (cid:20) (cid:21) 0 0 1 0 0 0 0 0 1 0 X = . 0 1 1 0 0 1 1 1 1 0 Following the FindBICA algorithm, we will first de- compose X into X0 and X . 1×T 1×T X0 =(cid:2)0 0 0 0(cid:3), 1×T (cid:2) (cid:3) X = 0 0 1 0 0 0 0 0 1 0 . 1×T From X0 , we have pˆ = 1 and pˆ = 0. Similarly 1×T 0 1 fromX ,wehavepˆ(cid:48) =0.8andpˆ(cid:48) =0.2.Applyingthe 1×T 0 1 procedure in Algorithm 2, we have the inferred results: Fig. 4: Hexagonal layout with users (‘+’), sniffers (solid dots), basestations(triangles),andchannelsofeachcell(indifferent (cid:20) (cid:21) Gˆ = 0 1 0 1 ,pˆ ={1,0,0.5,0.2}. triangle colors) 0 0 1 1 ThefirstandsecondcolumnofGˆ andpˆ willbepruned Inthissetofsimulations,500wirelessusersareplaced since we are not interested in components that cannot randomly in a 500 × 500 square meter area. The area be observed (with gˆ = 0) or components with their is partitioned into hexagon cells with circumcircle of i active probabilities smaller than ε (ε is set to be 0.01 radius 86 meters. Each cell is associated with a base in this case). Thus yields the final result to be exactly station operating in a channel (and so are the users equal to the ground truth. Toy example shows that both in the cell). The channel to base station assignment QLICA and BICA can accurately predict the underlying ensures that no neighboring cells use the same channel. modelwithoutprioryknowledgeonthelatentvariables’ 25 Sniffers are deployed in a grid formation separated activities in simple cases. The next section will provide by distance 100 meters, with a coverage radius of 120