
Model Selection Approach for Distributed Fault Detection in Wireless Sensor Networks


Mrinal Nandi a∗, Anup Dewanji b, Bimal Roy b and Santanu Sarkar c

a Department of Statistics, West Bengal State University, Barasat, India; b ASD, Indian Statistical Institute, 203 B. T. Road, Kolkata-700 108, India; c Chennai Mathematical Institute, Chennai, India.

January 22, 2013

arXiv:1301.4795v1 [cs.NI] 21 Jan 2013

Abstract

Sensor networks aim at monitoring their surroundings for event detection and object tracking. However, due to the failure or death of sensors, false signals can be transmitted. In this paper, we consider the problem of distributed fault detection in a wireless sensor network (WSN). In particular, we consider how to take a decision regarding fault detection in a noisy environment, where the noise results from false detection or false response to the event by some sensors, the sensors are placed at the centers of regular hexagons, and the event can occur in only one hexagon. We propose fault detection schemes that explicitly introduce the error probabilities into the optimal event detection process. We introduce two types of detection probabilities, one for the center node, where the event occurs, and the other for the adjacent nodes. This second type of detection probability is new in the sensor network literature. We develop schemes under the model selection procedure and the multiple model selection procedure, and use the concept of Bayesian model averaging to identify a set of likely faulty sensors and obtain an average predictive error.

Keywords: Event Detection, Wireless Sensor Network, Multiple Model Selection, Bayesian Model Averaging.

1 Introduction

Traditional and existing sensor-actuator networks use wired communication, whereas wireless sensor networks (WSN) provide radically new communication and networking paradigms and myriad new applications. The wireless sensors have small size, low battery capacity, non-renewable power supply, small processing power, limited buffer capacity and low-power radio.
They may measure distance, direction, speed, humidity, wind speed, soil makeup, temperature, chemicals, light, and various other parameters.

Recent advancements in wireless communications and electronics have enabled the development of low-cost WSNs. A WSN usually consists of a large number of small sensor nodes, each equipped with one or more sensors, some processing circuitry, and a wireless transceiver. One of the unique features of a WSN is random deployment in inaccessible terrains and cooperative effort, which offers unprecedented opportunities for a broad spectrum of civilian and military applications, such as industrial automation, military surveillance, national security, and emergency health care [2, 20, 1]. Sensor networks are also useful in detecting topological events such as forest fires [10].

Sensor networks aim at monitoring their surroundings for event detection and object tracking [2, 17]. Because of this surveillance goal, coverage is the functional basis of any sensor network. In order to fulfill its designated tasks, a sensor network must fully cover the Region of Interest (ROI) without leaving any internal sensing hole [3, 5, 6, 9]. So far, a number of movement-assisted sensor placement algorithms have been proposed. A dedicated survey of these topics is presented by Li et al. [14].

∗Corresponding author. Email: [email protected]

On the other hand, a sensor could die or fail at runtime for various reasons, such as power depletion or hardware defects. So, even after the ROI is fully covered by the sensors, wrong information can be communicated by some sensors, or sensors may fail to detect the event due to noise or obstructions. Chen et al. [8] have proposed a distributed localized fault detection algorithm for WSN, where each sensor identifies its own status as either good or faulty, and the claim is then supported or reverted by its neighbors. The proposed algorithm is analyzed using a probabilistic approach. Sharma et al.
[21] have characterized the different types of faults and proposed a different algorithm for fault detection that takes these types into account. Some of the methods are statistical, for example, using histograms. Both works can only detect the faulty sensors, but not the event.

One of the important sensor network applications is monitoring inaccessible environments. Sensor networks are used to determine event regions and boundaries in the environment with a distinguishable characteristic [13, 7, 19]. The basic idea of distributed detection [22] is to have each of the independent sensors make a local decision (typically a binary one, i.e., whether an event occurs or not) and then combine these decisions at a fusion sensor (the sensor which collects the local information and takes the decision), or at a base station, to generate a global decision.

1.1 Our Motivation

In this paper, we are interested in one particular query: determining an event in the environment (i.e., the ROI) with a distinguishable characteristic. We assume the ROI to be partitioned into a suitable number of congruent regular hexagonal cells (i.e., we can think of the ROI as a regular hexagonal grid). This physical structure of the ROI is not a requirement for the theoretical analysis; a similar analysis can be done with other structures. Suppose that sensors are placed a priori at the center of every hexagon of the grid (these centers are known as nodes). We assume that each sensor is connected to its adjacent sensor nodes in the sense that a hexagon is strongly covered by its center node and weakly covered by the adjacent nodes. If the event occurs in the hexagon where a particular sensor lies, then that sensor can detect the event with a greater probability, whereas if the event occurs in an adjacent hexagon, then the sensor can detect the event with a lesser probability.
Hence, only one node (the center node of the event hexagon) can detect the event hexagon with the greater probability, say $p_1$, and the adjacent nodes (six for interior nodes and fewer for boundary nodes) can detect the event hexagon with the lesser probability, say $p_2$, with $p_1 > p_2$. We assume that no other sensor can detect the event hexagon. In this paper, unlike the previous works, we assume that if the event occurs then it occurs at only one hexagon of the grid, which will be known as the event hexagon, and that there is no fusion sensor. All sensors can communicate with the base station, and the base station takes the decision about the query.

As an example, consider a network of devices that are capable of sensing mines or bombs, where a few mines or bombs may be placed in a particular area of the ROI. Information from these devices can be sent to a nearby police station, or a central facility. Then, an important query in this situation could be whether a particular hexagon is the event hexagon or not (i.e., whether mines or bombs are placed there or not).

One fundamental challenge in the event detection problem for a sensor network is the detection accuracy, which is disturbed by the noise associated with the detection and by the reliability of the sensor nodes. A sensor may fail to detect the event due to natural obstruction or other causes. After detecting the event, a sensor can send a false message to the base station for technical reasons. The sensors are usually low-end inexpensive devices and sometimes exhibit unreliable behavior. For example, a faulty sensor node may issue an alarm even though it has not received any signal for an event, or has not detected any event, and vice versa. Moreover, a sensor may be dead, in which case the sensor cannot send any alarm.

1.2 Our Contribution

In our theoretical analysis, the sensor fault probabilities are introduced into the optimal event detection process.
We apply the model selection approach, the multiple model selection approach and Bayesian model averaging methods [12, 18] to find a solution to the problem. We develop the schemes using the model selection technique. We calculate different error probabilities and obtain some theoretical results.

In all previous works, the authors assume only one detection probability. We introduce two detection probabilities, $p_1$ and $p_2$, one for the center node and the other for the adjacent nodes. Even if the center node fails to detect the event, the adjacent nodes may detect it, and vice versa. We consider these probabilities and show that, in various situations, the adjacent nodes play a key role in detecting the event. One can introduce more detection probabilities and analyze the situation in a similar manner.

The parameters $p_1$ and $p_2$, the detection probabilities of a sensor, and the error probabilities (see Section 4) cannot be estimated from real-life situations, but need to be estimated beforehand by some experimentation. The prior probabilities of various events also cannot be estimated, but may be known in some cases.

Finally, we calculate the error probabilities numerically for some values of the parameters of our model and make some concluding remarks analyzing the results.

Figure 1: Nodes placed at the centers when the ROI is partitioned into regular hexagons (a center node surrounded by six adjacent nodes).

2 Previous Work

Lou et al. [15] consider two important problems for distributed fault detection in WSN: 1) how to address both the noise-related measurement error and sensor fault simultaneously in fault detection, and 2) how to choose a proper neighborhood size n for a sensor node in fault correction such that energy can be conserved. They propose a fault detection scheme that explicitly introduces the sensor fault probability into the optimal event detection process.
They show that the optimal detection error decreases exponentially with the increase of the neighborhood size.

Krishnamachari and Iyengar [13] propose a distributed solution for a canonical task in WSN (i.e., the binary detection of interesting environmental events). They explicitly take into account the possibility of sensor measurement faults and develop a distributed Bayesian algorithm for detecting and correcting such faults.

Nandi et al. [16] consider the problem of distributed fault detection in a wireless sensor network (WSN), where the sensors are placed at the center of a particular square (or hexagon) of the grid covering the ROI. They proposed fault detection schemes that explicitly introduce the error probabilities into the optimal event detection process. They developed the schemes under the Neyman-Pearson hypothesis test and the Bayes test. They also calculate type I and type II errors for different values of the parameters.

In almost all the previous works, except [16], the authors assume that the event occurs over a region and that there are fusion sensors that collect the information locally and take a decision. Since they do not introduce the concept of a base station, there is no concept of response probability. Also, they assume that the information is spatially correlated.

Unlike the previous works, in this paper we assume that if the event occurs then it occurs at only one cell of the ROI, and there is no fusion sensor. All the sensors send information to the base station. We introduce the probability model in two different stages: first, when a sensor detects the event and, second, when a sensor sends the message to the base station. In the previous works, only one type of detection probability has been introduced to simulate the different error probabilities for some specific values of the parameters. In this paper, we introduce two different detection probabilities, obtain the exact test analytically, and estimate the error probabilities by simulation.
In almost all the previous works, authors assume the ROI to be a square grid. The hexagonal grid is better in the sense that the minimum number of sensors is required to cover the entire ROI [23].

3 Statement of the problem and Assumptions

In this section, we describe the problem in more specific terms and state the assumptions that we make.

Sensors are deployed, or manually placed, over the ROI to perform event detection (i.e., to detect whether an event of interest has happened or not) in the ROI. If sensors are deployed from the air then, using actuator-assisted sensor placement or movement-assisted sensor placement, the sensors are placed so that the sensor network covers the entire ROI. This ROI is partitioned into a suitable number of regular hexagons (i.e., we can think of the ROI as a regular hexagonal grid), as shown in Figure 1. Sensors are placed a priori at the center of every regular hexagon (these centers are known as nodes). Sensors have two detection probabilities. The sensor network covers the entire ROI and there is only one event hexagon, as discussed before.

Each sensor node determines its location through beacon positioning mechanisms [4] or by exploiting the Global Positioning System (GPS). Through a broadcast or acknowledge protocol, each sensor node is also able to locate the neighbors within its communication radius. Sensors are also able to communicate with the base station. The base station will take the decision. In this paper, we assume that either the event occurs at one particular hexagon of the grid, which will be known as the event hexagon, or the event does not occur (in which case we say the ROI is normal). All sensors can communicate with the base station, and the base station takes the decision by combining the information received from all the sensors.

There are two phases in the whole process. The first one is the detection phase, when the sensor at the center of a regular hexagon tries to detect the event.
The sensor at the center of the event hexagon can detect the event hexagon with the greater probability $p_1$, and the sensors at the adjacent nodes (see Figure 1) can detect the event hexagon with the lesser probability $p_2$. We also assume that there is a prior probability that a particular hexagon is an event hexagon.

The next phase is the response phase, in which sensors send a message to the base station. Even if the event hexagon is detected by a sensor, it may not respond (i.e., due to some technical fault, it sends a message to the base station that no event occurred in that cell and the neighboring cells) with some probability; then we say that the sensor is a faulty sensor. Conversely, if the event hexagon is not detected, or there is no event hexagon at all (i.e., the ROI is normal), then a faulty sensor can also send wrong information to the base station with some probability. A sensor is said to be a dead sensor if the sensor does not work. A dead sensor sends no response in either case.

Each sensor sends information to the base station. As the sensors may send wrong information, the base station plays the important role of identifying the event hexagon. The base station will collect all the information and take a decision about the event hexagon according to a rule which we have to find. Our job is to find a rule for the base station such that the base station works most efficiently.

3.1 Notations and Assumption

Our problem is to develop a strategy for the base station to take a decision about the event hexagon (i.e., which hexagon of the ROI is the event hexagon, if any). Let $R$ be the set of all nodes. For $N \in R$, define $B(N)$ as the set of adjacent node(s) of $N$, and let $k(N)$ be the number of adjacent node(s) of $N$. Hence, $0 \le k(N) \le 6$. Call a node $N$ interior if $k(N) = 6$. Let $S_N$ be the sensor which is placed at the node $N$ and $H_N$ be the hexagon where the node $N$ is placed (i.e., $N$ is the center of $H_N$). For $N \in R$, let $X_N$ denote the true status of the node $N$. That is, $X_N = 1$ if the event occurs at $H_N$, and 0 otherwise.
Also define $Y_N = 0$ if $S_N$ detects no event, and $Y_N = 1$ if $S_N$ detects the event in $H_N$ or $H_{N'}$, for $N' \in B(N)$. Finally, define $Z_N = 0$ if $S_N$ does not respond, i.e., the sensor informs the base station that the event does not occur at $H_N$ or $H_{N'}$ for $N' \in B(N)$, and $Z_N = 1$ if $S_N$ responds, i.e., the sensor $S_N$ informs the base station that the event has occurred in $H_N$ or $H_{N'}$, for $N' \in B(N)$.

Now we make one natural assumption: once the detection phase is completed, the response of a sensor depends only on what it detects and not on whether the event has actually occurred, i.e., $P(Z_N = k \mid Y_N, X_N) = P(Z_N = k \mid Y_N)$, for $k = 0, 1$. We also assume that the sensors work independently and identically.

Since we assume that there is at most one event hexagon, $\sum_{N \in R} X_N = 1$ or $0$. The possible true scenarios are, therefore, represented by the following $|R| + 1$ different models:

$$M_0 : (X_N = 0 \text{ for all } N \in R),$$

and, for each $N \in R$,

$$M_N : (X_N = 1 \text{ and } X_{N'} = 0 \text{ for all } N' \in R \setminus \{N\}).$$

Let $\Pr(M_0) = P(\text{ROI is normal}) = p_{\mathrm{norm}}$ and, for all $N \in R$, $\Pr(M_N) = \Pr(\text{event occurs at the hexagon } H_N) = p_N$. In particular, we may assume the $p_N$'s to be the same for all $N$. We denote any probability under the model $M_0$ as $P_{M_0}(\cdot)$ and under the model $M_N$ as $P_{M_N}(\cdot)$.

We also make the following assumptions:

• For all $N \in R$, $P_{M_0}(Y_N = 1) = 0$ and $P_{M_N}(Y_N = 1) = p_1$.
• For all $N' \in B(N)$, $P_{M_N}(Y_{N'} = 1) = p_2$, and for all $N' \in R \setminus [B(N) \cup \{N\}]$, $P_{M_N}(Y_{N'} = 1) = 0$.
• For all $N \in R$, $P(Z_N = 1 \mid Y_N = 1) = p_c$ and $P(Z_N = 1 \mid Y_N = 0) = p_w$.
• $Z_N$ and $Y_{N'}$ are independent for $N \ne N'$.
• The responses from different nodes are independent under a particular model, i.e., the $Z_N$'s are independent under $M_{N'}$ for a fixed $N' \in R$.

4 Theoretical Analysis of fault Detection

In this section we discuss some theoretical results. In real situations, $|R|$ may be very large. Given the network of the sensor nodes and some prior knowledge about the nature of the event, one may have a fairly good idea about the set of feasible regions for the event.
Formally, instead of all possible models, one may be able to restrict attention to a set containing all the feasible models. For example, if the event is known to take place in a particular region, we can restrict our models accordingly.

4.1 Model Selection Approach

For all $N \in R$,

$$P_{M_0}(Z_N = 1) = P_{M_0}(Z_N = 1 \mid Y_N = 0) P_{M_0}(Y_N = 0) + P_{M_0}(Z_N = 1 \mid Y_N = 1) P_{M_0}(Y_N = 1)$$
$$= P(Z_N = 1 \mid Y_N = 0) P_{M_0}(Y_N = 0) + P(Z_N = 1 \mid Y_N = 1) P_{M_0}(Y_N = 1) = p_w.$$

Hence, under the model $M_0$, $Z_N$ follows $\mathrm{Ber}(p_w)$ for all $N \in R$, and the likelihood of the data $\{Z_N = z_N, \text{ for all } N \in R\}$ under the model $M_0$ is

$$L_0 = P_{M_0}(Z_N = z_N, \text{ for all } N \in R) = \prod_{N \in R} p_w^{z_N} (1 - p_w)^{1 - z_N} = p_w^{\sum_{N \in R} z_N} (1 - p_w)^{\sum_{N \in R} (1 - z_N)}.$$

So

$$\ln L_0 = \sum_{N \in R} z_N \ln p_w + \sum_{N \in R} (1 - z_N) \ln (1 - p_w).$$

For any $N \in R$, we have

$$P_{M_N}(Z_N = 1) = P_{M_N}(Z_N = 1 \mid Y_N = 0) P_{M_N}(Y_N = 0) + P_{M_N}(Z_N = 1 \mid Y_N = 1) P_{M_N}(Y_N = 1)$$
$$= P(Z_N = 1 \mid Y_N = 0) P_{M_N}(Y_N = 0) + P(Z_N = 1 \mid Y_N = 1) P_{M_N}(Y_N = 1)$$
$$= p_w (1 - p_1) + p_c p_1 = p_1 (p_c - p_w) + p_w = P_1, \text{ say}.$$

Hence, for all $N \in R$, under $M_N$, $Z_N$ follows $\mathrm{Ber}(P_1)$. Similarly, for all $N' \in B(N)$, under $M_N$, $Z_{N'}$ follows $\mathrm{Ber}(P_2)$, where $P_2 = p_2 (p_c - p_w) + p_w$, and, under $M_N$, $Z_{N'}$ follows $\mathrm{Ber}(p_w)$ for all $N' \in R \setminus [B(N) \cup \{N\}]$. Note that $P_1 > P_2$ since $p_1 > p_2$ (assuming $p_c > p_w$). Hence the likelihood for the model $M_N$, given $Z_{N'} = z_{N'}$, $N' \in R$, is

$$L_N = P_{M_N}(Z_{N'} = z_{N'}, \text{ for all } N' \in R)$$
$$= P_1^{z_N} (1 - P_1)^{1 - z_N} \prod_{N' \in B(N)} P_2^{z_{N'}} (1 - P_2)^{1 - z_{N'}} \times \prod_{N' \in R \setminus [B(N) \cup \{N\}]} p_w^{z_{N'}} (1 - p_w)^{1 - z_{N'}}$$
$$= P_1^{z_N} (1 - P_1)^{1 - z_N} P_2^{\sum_{N' \in B(N)} z_{N'}} (1 - P_2)^{\sum_{N' \in B(N)} (1 - z_{N'})} \times p_w^{\sum_{N' \in R \setminus [B(N) \cup \{N\}]} z_{N'}} (1 - p_w)^{\sum_{N' \in R \setminus [B(N) \cup \{N\}]} (1 - z_{N'})}.$$

Let

$$T_N = \sum_{N' \in B(N)} Z_{N'}, \quad \text{so that} \quad \sum_{N' \in B(N)} (1 - Z_{N'}) = k(N) - T_N,$$

with the corresponding observed values denoted by

$$t_N = \sum_{N' \in B(N)} z_{N'} \quad \text{and} \quad \sum_{N' \in B(N)} (1 - z_{N'}) = k(N) - t_N.$$

Therefore,

$$\ln L_N = z_N \ln P_1 + (1 - z_N) \ln (1 - P_1) + t_N \ln P_2 + (k(N) - t_N) \ln (1 - P_2) + \sum_{N' \in R \setminus [B(N) \cup \{N\}]} z_{N'} \ln p_w + \sum_{N' \in R \setminus [B(N) \cup \{N\}]} (1 - z_{N'}) \ln (1 - p_w)$$
$$= \ln L_0 + z_N \ln \frac{P_1}{p_w} + (1 - z_N) \ln \frac{1 - P_1}{1 - p_w} + t_N \ln \frac{P_2}{p_w} + (k(N) - t_N) \ln \frac{1 - P_2}{1 - p_w}$$
$$= \ln L_0 + z_N \ln \frac{P_1 (1 - p_w)}{p_w (1 - P_1)} + t_N \ln \frac{P_2 (1 - p_w)}{p_w (1 - P_2)} + \ln \frac{1 - P_1}{1 - p_w} + k(N) \ln \frac{1 - P_2}{1 - p_w}$$
$$= a + b \, (c z_N + t_N - d \, k(N)), \text{ say},$$

where

$$a = \ln L_0 + \ln \frac{1 - P_1}{1 - p_w}, \qquad b = \ln \frac{P_2 (1 - p_w)}{p_w (1 - P_2)} > 0,$$
$$c = \frac{\ln \frac{P_1 (1 - p_w)}{p_w (1 - P_1)}}{\ln \frac{P_2 (1 - p_w)}{p_w (1 - P_2)}} \quad \text{and} \quad d = \frac{\ln \frac{1 - p_w}{1 - P_2}}{\ln \frac{P_2 (1 - p_w)}{p_w (1 - P_2)}}$$

are independent of $N$.

In the model selection approach, the model resulting in the maximum value of the likelihood is selected. Note that, since there is no parameter being estimated, this is equivalent to the well-known Akaike Information Criterion (AIC) [11]. Therefore, the base station will accept the model $M_0$ if

$$\ln L_N - \ln L_0 = \ln \frac{1 - P_1}{1 - p_w} + b \, (c z_N + t_N - d \, k(N)) < 0, \text{ for all } N \in R.$$

Otherwise, as $b$ is positive, accept the model $M_N$ for which $(c z_N + t_N - d \, k(N))$ is maximum among all $N \in R$. If the values of $(c z_N + t_N - d \, k(N))$ are equal for more than one $N$, then we can select one of the corresponding models with equal probability. If we want to maximize the likelihood for the models $M_N$ corresponding to the interior nodes only, so that $k(N)$ is fixed, then we need to maximize $(c z_N + t_N)$ among all $N \in R$.

4.2 Multiple Model Selection

Instead of selecting one particular model, one may want to select more than one model, with log-likelihood values approximately similar to the maximum one. We can consider the set of models

$$\left\{ M_K : \frac{L_K}{\max_{N \in R} L_N} > C \right\},$$

where $0 < C < 1$ is a suitable constant close to 1. This $C$ is usually chosen according to the resources available. This is similar to the idea of Occam's window in the context of Bayesian model selection [12]. It may be interpreted as interval estimation for the true model.

Note that $L_N$ is an increasing function of $c z_N + t_N - d \, k(N)$, as $b$ is positive.
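The selection rule of Section 4.1 reduces to computing the statistic $c z_N + t_N - d\,k(N)$ at every node and comparing its maximum against the $M_0$ threshold. The following is a minimal Python sketch of that rule, not the authors' implementation. The function names `selection_constants` and `select_model`, and the dictionary-based representation of responses and adjacency, are hypothetical. The sketch assumes $p_2 > 0$ and $p_c > p_w > 0$ so that $b > 0$, and it returns all maximizing nodes rather than choosing one at random.

```python
import math


def selection_constants(p1, p2, pc, pw):
    """Return (P1, P2, b, c, d) as defined in Section 4.1.

    Assumes 0 < pw < pc < 1 and p2 > 0, so that b > 0.
    """
    P1 = p1 * (pc - pw) + pw  # P(Z = 1) at the event node
    P2 = p2 * (pc - pw) + pw  # P(Z = 1) at a node adjacent to the event node
    b = math.log(P2 * (1 - pw) / (pw * (1 - P2)))
    c = math.log(P1 * (1 - pw) / (pw * (1 - P1))) / b
    d = math.log((1 - pw) / (1 - P2)) / b
    return P1, P2, b, c, d


def select_model(z, neighbors, p1, p2, pc, pw):
    """Apply the maximum-likelihood rule to observed responses.

    z         : dict mapping node -> 0/1 response (hypothetical layout)
    neighbors : dict mapping node -> list of adjacent nodes
    Returns None when M_0 is accepted, otherwise the list of nodes N
    whose statistic c*z_N + t_N - d*k(N) attains the maximum.
    """
    P1, _, b, c, d = selection_constants(p1, p2, pc, pw)
    offset = math.log((1 - P1) / (1 - pw))
    q = {}
    for n in z:
        t = sum(z[m] for m in neighbors[n])  # t_N: responses among neighbors
        q[n] = c * z[n] + t - d * len(neighbors[n])
    q_max = max(q.values())
    if offset + b * q_max < 0:  # ln L_N - ln L_0 < 0 for every N
        return None
    return [n for n in q if math.isclose(q[n], q_max, abs_tol=1e-12)]
```

A caller that wants the single-model rule of Section 4.1 would pick one element of the returned list uniformly at random; keeping the whole list corresponds to selecting every model that attains the maximum likelihood.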
We consider only the following set of models

$$\{ M_K : Q_K > C^* \cdot \max_{N \in R} Q_N \},$$

where $Q_N = c z_N + t_N - d \, k(N)$, for all $N \in R$, with $0 < C^* < 1$. In particular, if we consider the interior nodes only, then we consider the set of models given by

$$\{ M_K : c z_K + t_K > C^* \cdot \max_{N \in R} \{ c z_N + t_N \} \}.$$

We can select multiple models using some other criteria. One such criterion is to select all the models (one or more) for which the maximum value of the likelihood is attained. Let $\mathcal{N}_{\max}$ be the set of nodes corresponding to all these models, including '$N = 0$' corresponding to $M_0$ if it has the maximum value of the likelihood. Then this method selects all the models $M_N$ with $N \in \mathcal{N}_{\max}$. By another criterion, one may select the models $M_{N'}$ for $N' \in \mathcal{N}_{\max} \cup [\cup_{N \in \mathcal{N}_{\max}} B(N)]$; that is, $N'$ is a node in $\mathcal{N}_{\max}$ or any of the neighboring nodes of a node in $\mathcal{N}_{\max}$. Note that $B(N)$ for $N = 0$ is the empty set. One can combine these two types of criteria and come up with many others.

4.3 Bayesian Model Averaging

Bayesian model averaging is an effective method to solve a decision problem when there are many alternative hypotheses or models, which are complicated [12]. Suppose $M_1, M_2, \ldots, M_k$ are the models considered and $D$ denotes the given data. The posterior probability for model $M_k$ is given by

$$\Pr(M_k \mid D) = \frac{\Pr(D \mid M_k) \Pr(M_k)}{\sum_l \Pr(D \mid M_l) \Pr(M_l)},$$

where $\Pr(D \mid M_k)$ denotes the probability of observing data $D$ under the model $M_k$ (which is essentially the likelihood $L_k$ under $M_k$) and $\Pr(M_k)$ is the prior probability that $M_k$ is the true model (assuming that one of the models is true).

In this work, the data $D$ is $\{Z_N = z_N : N \in R\}$ and the models are $M_0$ and $M_N$, $N \in R$, as defined in Section 3.1. Hence, the posterior probability for model $M_N$ is

$$\Pr(M_N \mid Z_N = z_N, N \in R) = \frac{p_N L_N}{\sum_{l \in R} p_l L_l + p_{\mathrm{norm}} L_0},$$

and that for $M_0$ is

$$\frac{p_{\mathrm{norm}} L_0}{\sum_{l \in R} p_l L_l + p_{\mathrm{norm}} L_0}.$$

We select the model $M_0$ if $p_{\mathrm{norm}} L_0$ is greater than $p_N L_N$ for all $N \in R$; otherwise, we select the $M_N$ for which $p_N L_N$ is maximum among all $N \in R$.
Hence, if the $p_N$'s are all equal, then the Bayesian approach is the same as the likelihood approach.

5 Some Important Considerations and Error Probabilities

In this section, we consider some important issues related to the problem of fault detection and the proposed methodology, including the calculation of errors (e.g., false detection) and detection probabilities.

The following probabilities give some idea about the role of neighboring nodes, along with the center node, in detection, or false detection, of the event. For example, $P_{M_0}(T_N = 0, Z_N = 1)$ gives the probability of a false detection by the $N$th node, and not by the neighboring nodes, while $P_{M_N}(T_N = 6, Z_N = 0)$ gives the probability of a false negative by the $N$th node, with all the neighboring nodes detecting the event. Since, given a particular model, $T_N$ and $Z_N$ are independent, the calculation of such probabilities is simple, as given in the following. For any $N \in R$ and $i = 0, 1, \ldots, k(N)$,

1. $P_{M_0}(T_N = i, Z_N = 0) = \binom{k(N)}{i} p_w^{i} (1 - p_w)^{k(N) - i + 1}$
2. $P_{M_0}(T_N = i, Z_N = 1) = \binom{k(N)}{i} p_w^{i + 1} (1 - p_w)^{k(N) - i}$
3. $P_{M_N}(T_N = i, Z_N = 0) = \binom{k(N)}{i} P_2^{i} (1 - P_2)^{k(N) - i} (1 - P_1)$
4. $P_{M_N}(T_N = i, Z_N = 1) = \binom{k(N)}{i} P_2^{i} (1 - P_2)^{k(N) - i} P_1$.

Note that, for $N \in R$,

$$P_{M_0}(L_N > L_0) = P_{M_0}(\ln L_N > \ln L_0)$$
$$= P_{M_0}\left( Z_N \ln \frac{P_1 (1 - p_w)}{p_w (1 - P_1)} + T_N \ln \frac{P_2 (1 - p_w)}{p_w (1 - P_2)} + \ln \frac{1 - P_1}{1 - p_w} + k(N) \ln \frac{1 - P_2}{1 - p_w} > 0 \right)$$
$$= P_{M_0}\left( Z_N \ln \frac{P_1 (1 - p_w)}{p_w (1 - P_1)} + T_N \ln \frac{P_2 (1 - p_w)}{p_w (1 - P_2)} > k(N) \ln \frac{1 - p_w}{1 - P_2} + \ln \frac{1 - p_w}{1 - P_1} \right),$$

which can be numerically obtained using the joint distribution of $T_N$ and $Z_N$ under the model $M_0$. The maximum of these probabilities over all $N$ gives a lower bound for the probability that a node is considered to be an event node when the ROI is normal. On the other hand, the sum over all $N$ gives an upper bound for the same.
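As an illustration of how $P_{M_0}(L_N > L_0)$ can be obtained numerically, the sketch below enumerates the joint distribution of $(Z_N, T_N)$ under $M_0$, where $Z_N \sim \mathrm{Ber}(p_w)$ and $T_N \sim \mathrm{Bin}(k(N), p_w)$ independently, and sums the probabilities of the outcomes that satisfy the inequality. It is a Python sketch under stated assumptions, not the authors' code: the function name `prob_ln_exceeds_l0_under_m0` is hypothetical, the default `k=6` corresponds to an interior node, and $0 < p_w < p_c < 1$ is assumed so that all logarithms are finite.

```python
import math


def prob_ln_exceeds_l0_under_m0(p1, p2, pc, pw, k=6):
    """P_{M_0}(L_N > L_0) for a node with k neighbors, by enumeration.

    Assumes 0 < pw < pc < 1; k=6 is an interior node.
    """
    P1 = p1 * (pc - pw) + pw
    P2 = p2 * (pc - pw) + pw
    w_z = math.log(P1 * (1 - pw) / (pw * (1 - P1)))  # weight on Z_N
    w_t = math.log(P2 * (1 - pw) / (pw * (1 - P2)))  # weight on T_N
    rhs = k * math.log((1 - pw) / (1 - P2)) + math.log((1 - pw) / (1 - P1))
    total = 0.0
    for z in (0, 1):
        for t in range(k + 1):
            if z * w_z + t * w_t > rhs:
                # joint pmf under M_0 (items 1 and 2 of the list above)
                total += math.comb(k, t) * pw ** (t + z) * (1 - pw) ** (k - t + 1 - z)
    return total
```

Evaluating the function once per distinct value of $k(N)$ and then taking the maximum, or the sum weighted by the number of nodes with that $k(N)$, gives the lower and upper bounds described above.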
Similarly, for $N \in R$,

$$P_{M_N}(L_N < L_0) = P_{M_N}\left( Z_N \ln \frac{P_1 (1 - p_w)}{p_w (1 - P_1)} + T_N \ln \frac{P_2 (1 - p_w)}{p_w (1 - P_2)} < k(N) \ln \frac{1 - p_w}{1 - P_2} + \ln \frac{1 - p_w}{1 - P_1} \right),$$

which can again be numerically obtained using the joint distribution of $T_N$ and $Z_N$ under the model $M_N$. This probability gives some idea about the error that arises when the $N$th node is the event node and it is not detected.

As noted in Section 4.1, we select the model $M_N$ for which $Q_N$ is the maximum, for $N \in R$. The random variable $Q_N$ is, therefore, of some interest, and its distribution under different models is useful in calculating many error probabilities. We first find the distribution of $Q_N$ under the model $M_N$. Note that $Q_N$ takes the values $ci + j - d \, k(N)$, corresponding to $Z_N = i$ and $T_N = j$, for $i = 0, 1$ and $j = 0, 1, 2, \ldots, k(N)$. Assume, for convenience, that the values of $Q_N$ for different $i$ and $j$ are all distinct. Therefore, for $i = 0, 1$ and $j = 0, 1, \ldots, k(N)$,

$$P_{M_N}(Q_N = ci + j - d \, k(N)) = \binom{k(N)}{j} P_1^{i} (1 - P_1)^{1 - i} P_2^{j} (1 - P_2)^{k(N) - j}$$

and

$$P_{M_0}(Q_N = ci + j - d \, k(N)) = \binom{k(N)}{j} p_w^{i + j} (1 - p_w)^{1 - i + k(N) - j}.$$

For $N' \in B(N)$, or $N' \in R \setminus [B(N) \cup \{N\}]$, one can find $P_{M_{N'}}(Q_N = ci + j - d \, k(N))$ in a similar manner, although the calculation is very tedious as there are many sub-cases. Ideally, one is interested in the probability of errors occurring at the level of the base station. For example, the two important errors are: (1) not selecting $M_0$ when $M_0$ is true (false positive), and (2) selecting $M_0$ when $M_N$ is true for some $N \in R$ (false negative). Theoretical calculation of these error probabilities is complicated. We, therefore, use simulation to estimate these and similar error probabilities.

6 Simulation Study

We consider a 32 × 32 hexagonal grid and run the programme 10000 times. The simulation is performed using C code, and the required random numbers are generated using the standard C library.

In our simulation study, we consider different criteria, as discussed in Sections 4.1 and 4.2, for estimating the error probabilities, or equivalently, the success rate.
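The authors' C program is not reproduced in the text, so the following Python sketch is only an illustrative reconstruction of one simulation run under an event model $M_N$. Several details are assumptions rather than statements from the paper: the grid uses an "odd-r" offset layout for hexagon adjacency, $p_2 > 0$ and $p_c > p_w > 0$ are required so that $b > 0$, and the names `hex_neighbors` and `simulate` are hypothetical. The function returns two proportions: how often the true event node is picked when one maximizing node is chosen at random, and how often it belongs to the set of all maximizing nodes.

```python
import math
import random


def hex_neighbors(rows, cols):
    """Adjacency of a rows x cols hexagonal grid in odd-r offset
    coordinates (an assumed layout; the paper does not specify one)."""
    even = [(-1, -1), (-1, 0), (0, -1), (0, 1), (1, -1), (1, 0)]
    odd = [(-1, 0), (-1, 1), (0, -1), (0, 1), (1, 0), (1, 1)]
    nbrs = {}
    for r in range(rows):
        for c in range(cols):
            deltas = odd if r % 2 else even
            nbrs[(r, c)] = [(r + dr, c + dc) for dr, dc in deltas
                            if 0 <= r + dr < rows and 0 <= c + dc < cols]
    return nbrs


def simulate(rows, cols, p1, p2, pc, pw, runs, seed=0):
    """Estimate two success proportions when an event is present.

    Returns (random-pick proportion, maximizing-set proportion).
    Assumes p2 > 0 and 0 < pw < pc < 1.
    """
    rng = random.Random(seed)
    nbrs = hex_neighbors(rows, cols)
    nodes = list(nbrs)
    P1 = p1 * (pc - pw) + pw
    P2 = p2 * (pc - pw) + pw
    b = math.log(P2 * (1 - pw) / (pw * (1 - P2)))
    c = math.log(P1 * (1 - pw) / (pw * (1 - P1))) / b
    d = math.log((1 - pw) / (1 - P2)) / b
    offset = math.log((1 - P1) / (1 - pw))
    hit_random = hit_set = 0
    for _ in range(runs):
        event = rng.choice(nodes)  # event hexagon chosen uniformly
        adjacent = set(nbrs[event])
        z = {}
        for n in nodes:
            # detection phase, then response phase
            p_detect = p1 if n == event else p2 if n in adjacent else 0.0
            detected = rng.random() < p_detect
            z[n] = int(rng.random() < (pc if detected else pw))
        q = {n: c * z[n] + sum(z[m] for m in nbrs[n]) - d * len(nbrs[n])
             for n in nodes}
        q_max = max(q.values())
        if offset + b * q_max < 0:
            continue  # M_0 selected, so the event is missed
        best = [n for n in nodes if q[n] >= q_max - 1e-12]
        hit_set += event in best
        hit_random += rng.choice(best) == event
    return hit_random / runs, hit_set / runs
```

For example, `simulate(32, 32, 0.9, 0.5, 0.9, 0.01, runs=100)` mirrors one parameter setting of the study with far fewer runs; pure Python is slow at the full 10000 runs, and results will differ from the reported values because the layout, random number generator and tie handling are not those of the original program.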
First consider the probability of selecting $M_0$ when it is true. Let $S_1$ denote the proportion of correct detection of the normal situation, when model $M_0$ is true, using the model selection method of Section 4.1. That is, $S_1$ gives an estimate of $P_{M_0}(0 \in \mathcal{N}_{\max}$ and 0 is selected by randomization$)$. Then $1 - S_1$ gives an estimate of the false positive rate.

When $M_N$ is true for some $N \in R$, let $S_2$ denote the proportion of correct decisions for the event node using the model selection method of Section 4.1, so that it estimates $P_{M_N}(N \in \mathcal{N}_{\max}$ and is selected by randomization$)$. Note that, for each simulation run, the event hexagon is chosen randomly, so that $S_2$ gives an average value over all $N$. In this context, this probability is the same for all the interior nodes. Then $1 - S_2$ gives an estimate of the corresponding error probability of not selecting $M_N$ when it is true.

Note that, in this problem of fault detection with a single event node, the likelihood value, for a given observed data configuration, may be equal for more than one model. Therefore, quite often, the maximum value of the likelihood may be attained by more than one model. The model selection method of Section 4.1, which selects one of these models randomly in such cases, may often not select the correct model. Therefore, the method of Section 4.2, which selects more than one model having similar likelihood values, may be preferred and will have a better chance of selecting the correct model. We now consider some of those methods.

Let us first consider the method in which all the models corresponding to the maximum value of the likelihood are selected. Let $S_3$ denote the proportion of correct selection of the model $M_N$, when it is true, by this method. Then $S_3$ estimates the probability $P_{M_N}(N \in \mathcal{N}_{\max})$, which is always more than or equal to the quantity estimated by $S_2$, as remarked before. We also consider the method in which all the models having maximum likelihood along with their neighborhood models are selected.
A model $M_{N'}$ is a neighborhood model of the model $M_N$ if $N'$ is a neighboring node of $N$. If $S_4$ denotes the proportion of correct selection of the model $M_N$, when it is true, by this method, then $S_4$ estimates $P_{M_N}(N \in \mathcal{N}_{\max} \cup \{\cup_{N' \in \mathcal{N}_{\max}} B(N')\})$. Clearly, $S_4 \ge S_3 \ge S_2$. Similarly, if $S_5$ denotes the proportion of correct selection of the model $M_N$, when it is true, by selecting all those models with likelihood value more than 90% of the maximum likelihood (that is, the method of Section 4.2 with $C = 0.9$), then $S_5$ estimates the probability $P_{M_N}(L_N > 0.9 L_{\max})$, with $L_{\max}$ denoting the maximum value of the likelihood.

Table 1: Simulation of estimated probabilities for some values of the parameters

Simulation of different probabilities with pc = 0.9

p1    p2   pw     S1    S2    S3    N3     S4    N4     S5    N5
0.9   0.0  0.01   0.00  0.08  0.81  18.16  0.81  59.85  0.82  18.17
0.9   0.3  0.01   0.00  0.47  0.69  5.44   0.70  14.81  0.73  5.70
0.9   0.4  0.01   0.00  0.60  0.79  5.11   0.78  13.51  0.80  5.12
0.9   0.5  0.01   0.00  0.70  0.85  4.64   0.85  11.50  0.86  4.88
0.9   0.6  0.01   0.00  0.79  0.90  3.82   0.91  8.18   0.92  3.90
0.9   0.0  0.001  0.35  0.50  0.81  3.17   0.81  7.17   0.81  3.18
0.9   0.3  0.001  0.36  0.59  0.82  3.03   0.83  6.34   0.86  3.06
0.9   0.4  0.001  0.35  0.67  0.87  2.89   0.87  6.18   0.89  3.03
0.9   0.5  0.001  0.36  0.75  0.90  2.89   0.89  5.85   0.93  2.96
0.9   0.6  0.001  0.36  0.83  0.94  2.74   0.93  5.33   0.96  2.79
0.99  0.0  0.01   0.00  0.08  0.89  17.43  0.90  56.20  0.89  17.70
0.99  0.3  0.01   0.00  0.51  0.73  5.18   0.73  12.98  0.79  5.77
0.99  0.4  0.01   0.00  0.62  0.81  5.09   0.82  13.01  0.84  5.21
0.99  0.5  0.01   0.00  0.73  0.88  4.97   0.88  12.69  0.89  5.04
0.99  0.6  0.01   0.00  0.81  0.92  3.66   0.92  7.35   0.93  3.57
0.99  0.0  0.001  0.35  0.57  0.89  3.19   0.89  6.69   0.89  3.20
0.99  0.3  0.001  0.35  0.62  0.88  2.99   0.87  6.00   0.90  3.02
0.99  0.4  0.001  0.36  0.70  0.90  2.91   0.91  5.83   0.93  2.97
0.99  0.5  0.001  0.36  0.79  0.93  2.82   0.94  5.67   0.95  2.83
0.99  0.6  0.001  0.36  0.84  0.95  2.75   0.95  5.31   0.97  2.68

Simulation of different probabilities with pc = 0.99

p1    p2   pw     S1    S2    S3    N3     S4    N4     S5    N5
0.9   0.0  0.01   0.00  0.08  0.90  18.2   0.89  55.62  0.90  17.58
0.9   0.3  0.01   0.00  0.54  0.76  5.14   0.76  13.08  0.79  5.64
0.9   0.4  0.01   0.00  0.67  0.85  5.05   0.85  12.92  0.87  5.12
0.9   0.5  0.01   0.00  0.77  0.91  4.86   0.90  12.10  0.91  5.02
0.9   0.6  0.01   0.00  0.86  0.94  3.57   0.93  7.24   0.95  3.57
0.9   0.0  0.001  0.36  0.57  0.90  3.18   0.89  6.61   0.89  3.19
0.9   0.3  0.001  0.36  0.65  0.88  2.98   0.89  6.34   0.92  3.02
0.9   0.4  0.001  0.36  0.73  0.92  2.87   0.92  5.91   0.94  2.91
0.9   0.5  0.001  0.35  0.81  0.94  2.81   0.94  5.51   0.96  2.82
0.9   0.6  0.001  0.37  0.88  0.96  2.72   0.96  5.24   0.97  2.90
0.99  0.0  0.01   0.00  0.09  0.98  16.9   0.98  51.40  0.98  17.6
0.99  0.3  0.01   0.00  0.58  0.83  5.66   0.83  14.73  0.87  5.69
0.99  0.4  0.01   0.00  0.69  0.90  5.43   0.91  14.38  0.92  5.69
0.99  0.5  0.01   0.00  0.80  0.94  4.61   0.94  11.19  0.95  4.73
0.99  0.6  0.01   0.00  0.87  0.96  3.26   0.97  6.26   0.96  3.35
0.99  0.0  0.001  0.35  0.62  0.98  3.20   0.98  6.32   0.98  3.28
0.99  0.3  0.001  0.36  0.69  0.94  2.90   0.94  5.88   0.97  3.00
0.99  0.4  0.001  0.36  0.76  0.95  2.89   0.96  5.59   0.98  2.85
0.99  0.5  0.001  0.36  0.83  0.97  2.70   0.97  5.48   0.98  2.80
0.99  0.6  0.001  0.36  0.89  0.98  2.68   0.98  5.19   0.99  2.69

Suppose $N_i$ denotes the average number of selected nodes to be searched corresponding to $S_i$, $i = 1, 2, \ldots, 5$. Clearly, $N_1 = 1 - S_1$ because we need no search when $M_0$ is selected. When the event occurs and we consider only one $N$ from $\mathcal{N}_{\max}$, we need at most one search (since no search is needed if $M_0$ is selected) and we have $N_2 \le 1$. In our simulation, we find $N_2 = 1$ in all the cases; that means, in simulation, $M_0$ has not been selected when the event occurred. Note that $N_3 \ge 1$ since we consider all $N$'s in $\mathcal{N}_{\max}$ for searching. Again, as before, $N_4 > N_3 \ge 1 \ge N_2$. Also, by definition, $N_5 \ge 1$. Table 1 presents the different $S_i$'s and $N_i$'s based on simulation for different values of $p_1$, $p_2$, $p_c$ and $p_w$, with $p_1$ and $p_c$ taking values 0.9 and 0.99, $p_w$ taking values 0.01 and 0.001, and $p_2$ taking values 0.0, 0.3, 0.4, 0.5 and 0.6.
The choice of $p_1$ and $p_c$ reflects correspondingly high probabilities, whereas that of $p_w$ reflects a small probability, as is desirable in a good sensor. Since the primary interest is to study the effect of detection by neighboring nodes, we consider $p_2 = 0$ (which means there is no effect of neighboring nodes) and some positive values less than $p_1$.

Note that the probability of correct detection under $M_0$ depends only on $p_w$. This is also evident in Table 1. Intuitively, if $p_w$ is high then the proportion $S_1$ of correct detection in the normal situation is low. In Table 1, we see that $S_1$ is 0 for $p_w = 0.01$, varies from 0.35 to 0.37 for $p_w = 0.001$ and varies from 0.90 to 0.91 for $p_w = 0.0001$ (not shown in Table 1). If we consider a smaller value of $p_w$, then the success probability $S_1$ will be higher. Hence, when the number of hexagons is large, $p_w$ must be low to get good results in the normal situation.

We see that the estimated false negative rate, that is, an estimate of $P_{M_N}(M_0 \text{ is selected})$, is often 0 in our simulation (not shown in Table 1). This is because, if the event occurs at $N$, then detection of the event by at least one of the nodes belonging to $\{N\} \cup B(N)$ is highly probable. Furthermore, since the grid size is large, one of the nodes belonging to $R \setminus (\{N\} \cup B(N))$ may respond wrongly, though it cannot detect the event. So, under $M_N$, there is only a small probability of selecting the ROI as normal. If we take $p_w$ and the detection probabilities $p_1$ and $p_2$ to be very small, then we may get some positive false negative rate, but this is not a desired condition for a good sensor.

From simulation, we see that, as $p_2$ increases (for positive $p_2$), the $S_i$ values increase whereas the $N_i$ values decrease. As $p_2$ increases, it helps to differentiate between the likelihood values, resulting in lower cardinality of the set $\mathcal{N}_{\max}$ and lower values of the $N_i$'s. Moreover, since the neighboring nodes help to detect the event, the success probability increases.
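The dependence of $S_1$ on $p_w$ discussed above is consistent with a simple back-of-the-envelope calculation, sketched here under two assumptions that are not stated in this section: the normal model $M_0$ is selected exactly when no sensor responds, and sensors respond falsely and independently with probability $p_w$. Under these assumptions $S_1 \approx (1 - p_w)^n$ for a grid of $n$ hexagons. The grid size `n = 1024` and the function name `prob_no_false_response` are hypothetical and used only for illustration.

```python
def prob_no_false_response(p_w, n):
    """Probability that none of n independent sensors responds falsely.

    Assumes (for this sketch only) that M_0 is selected exactly in this case.
    """
    return (1.0 - p_w) ** n


n = 1024  # hypothetical number of hexagons, not taken from the paper

for p_w in (0.01, 0.001, 0.0001):
    print(f"p_w = {p_w}: approx. S_1 = {prob_no_false_response(p_w, n):.2f}")
```

With this assumed grid size the three values come out close to 0.00, 0.36 and 0.90, which is in line with the simulated $S_1$ values quoted in the text and illustrates why $p_w$ has to shrink as the number of hexagons grows.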
From simulation, we find that, as $p_1$ increases, the success probabilities also increase, but the effect of $p_2$ is more prominent than that of $p_1$. On the other hand, the success probabilities also change with $p_w$ and $p_c$. Since $p_2 = 0$ means $P_2 = p_w$, there is little variability in the likelihood values, leading to a larger size of $\mathcal{N}_{\max}$.

When $p_w = 0.01$, the effect of $p_2$ on $S_3, S_4, S_5$ and $N_3, N_4, N_5$ seems to be significant, whereas the same cannot be said for $p_w = 0.001$. There is a sudden change in the $S_i$'s and $N_i$'s when we shift from $p_2 = 0$ to $p_2 = 0.3$ for $p_w = 0.01$, but not for $p_w = 0.001$. So, when $p_w$ is small, the effect of the neighborhood seems to be smaller.

The values of $S_3$ and $S_4$ are very similar for different values of the parameters, but the large increase of $N_4$ over $N_3$ suggests that the idea of neighboring search is not effective. On the other hand, $S_3$ is much higher than $S_2$, so searching all the nodes in $\mathcal{N}_{\max}$ is a better idea than searching a random node from $\mathcal{N}_{\max}$.

We estimate the success probability $P_{M_N}(L_N > C \cdot L_{\max})$ by simulation for different values of the threshold $C$ ranging from 0.5 to 0.9 (see Table 2). Note that $S_5$ corresponds to the threshold value $C = 0.9$. We consider $p_1 = 0.99$, $p_w = 0.001$, $p_c = 0.9$ and four values of $p_2 = 0.3, 0.4, 0.5, 0.6$. From Table 2, we see that the success probability increases as the threshold value $C$ decreases and as $p_2$ increases. Similarly, the number of searches decreases as $C$ and $p_2$ increase.
Table 2: Simulation of estimated success probabilities and number of searches for different threshold values ($C$) and some values of the parameters with $p_c = p_1 = 0.9$

| $p_2$ | $p_w$ | success ($C=0.6$) | search ($C=0.6$) | success ($C=0.7$) | search ($C=0.7$) | success ($C=0.8$) | search ($C=0.8$) | success ($C=0.9$) | search ($C=0.9$) |
|-------|-------|------|-------|------|-------|------|-------|------|-------|
| 0.0   | 0.01  | 0.81 | 18.21 | 0.81 | 18.25 | 0.81 | 18.21 | 0.81 | 18.17 |
| 0.3   | 0.01  | 0.87 | 13.72 | 0.78 | 9.13  | 0.75 | 6.64  | 0.73 | 5.70  |
| 0.4   | 0.01  | 0.89 | 8.86  | 0.85 | 6.47  | 0.82 | 5.46  | 0.80 | 5.12  |
| 0.5   | 0.01  | 0.93 | 6.88  | 0.90 | 5.69  | 0.89 | 5.05  | 0.86 | 4.88  |
| 0.6   | 0.01  | 0.97 | 5.27  | 0.96 | 4.91  | 0.93 | 4.04  | 0.92 | 3.90  |
| 0.0   | 0.001 | 0.80 | 3.27  | 0.80 | 3.21  | 0.80 | 3.17  | 0.80 | 3.18  |
| 0.3   | 0.001 | 0.91 | 4.15  | 0.91 | 3.65  | 0.87 | 3.31  | 0.86 | 3.06  |
| 0.4   | 0.001 | 0.94 | 4.25  | 0.94 | 3.69  | 0.93 | 3.31  | 0.89 | 3.03  |
| 0.5   | 0.001 | 0.97 | 4.24  | 0.97 | 3.64  | 0.96 | 3.26  | 0.93 | 2.96  |
| 0.6   | 0.001 | 0.99 | 3.96  | 0.98 | 3.18  | 0.98 | 3.04  | 0.96 | 2.79  |
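One way to read the trade-off in Table 2 is that the selected sets are nested: lowering $C$ can only add nodes to the set $\{N : L_N > C \cdot L_{\max}\}$, so within a single run neither the chance of including the true node nor the number of searches can go down. The following minimal sketch counts the selected nodes for several thresholds; the helper name `sweep_thresholds` and the likelihood values are hypothetical and chosen only for illustration.

```python
def sweep_thresholds(likelihoods, thresholds):
    """Count, for each threshold c, the nodes with likelihood above c * L_max.

    likelihoods: list of per-node likelihood values (hypothetical toy input).
    """
    l_max = max(likelihoods)
    return {c: sum(1 for l in likelihoods if l > c * l_max) for c in thresholds}


# Made-up likelihood values for six candidate nodes in one simulated run.
toy = [1.00, 0.95, 0.85, 0.75, 0.65, 0.20]

counts = sweep_thresholds(toy, (0.6, 0.7, 0.8, 0.9))
for c, k in counts.items():
    print(f"C = {c}: {k} node(s) to search")
```

In this toy run the count falls from 5 nodes at $C = 0.6$ to 2 nodes at $C = 0.9$, mirroring the direction of the search columns in Table 2.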
