ebook img

INSQ: An Influential Neighbor Set Based Moving kNN Query Processing System PDF

1 MB·
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview INSQ: An Influential Neighbor Set Based Moving kNN Query Processing System

INSQ: An Influential Neighbor Set Based Moving kNN Query Processing System Chuanwen Li∗, Yu Gu∗, Jianzhong Qi†, Ge Yu∗, Rui Zhang† and Qingxu Deng∗ ∗College of Information Science and Engineering Northeastern University Email: {lichuanwen,guyu,yuge,dengqx}@ise.neu.edu.cn †Department of Computing and Information Systems University of Melbourne 6 Victoria, Australia 1 Email: {jianzhong.qi,rui.zhang}@unimelb.edu.au 0 2 Abstract—We revisit the moving k nearest neighbor (MkNN) overhead: (i) construction overhead: the safe region needs to b query,whichcomputesone’sknearestneighborsetandmaintains be recomputed every time the kNN set is recomputed; (ii) e it while at move. Existing MkNN algorithms are mostly safe validation overhead: whether the query object is still inside F regionbased,whichlackefficiencyduetoeithercomputingsmall the current safe region needs to be checked every time the safe regions with a high recomputation frequency or computing query object location updates. 2 larger safe regions but with a high cost for each computation. In this demonstration, we showcase a system named INSQ that Existing methods either fall short in construction overhead B] adoptsanovelalgorithmcalledtheInfluentialNeighborSet(INS) or validation overhead. Specifically, earlier studies [2], [6] algorithm to process the MkNN query in both two-dimensional use Voronoi cells [6] as the safe regions, which have high D Euclidean space and road networks. This algorithm uses a small construction overhead. A more recent study [5] uses relaxed . setofsafeguardingobjectsinsteadofsaferegions.Aslongasthe s thecurrentknearestneighborsareclosertothequeryobjectthan safe regions to reduce the construction overhead, but it has c to recompute the safe regions more frequently and has higher the safe guarding objects are, the current k nearest neighbors [ stay valid and no recomputation is required. Meanwhile, the validation overhead. 2 regiondefinedbythesafeguardingobjectsisthelargestpossible A Voronoi cell based safe region is essentially an order- v saferegion.Thismeansthattherecomputationfrequencyisalso k Voronoi cell. Given a kNN set, its order-k Voronoi cell is 3 minimized and hence, the INS algorithm achieves high overall 6 query processing efficiency. defined as a region where the kNN set stays valid [6]. The 3 computationcostofcomputingorder-kVoronoicellsonthefly 0 isprohibitivelyhighsinceitrequiresacombinationofkorder- 0 I. INTRODUCTION 1 Voronoi cells. Precomputing the order-k Voronoi cells [2] is . also unpractical due to the rapid increase in the number of 2 Location-based services (LBS) are more and more popular order-k Voronoi cells as k increases. 0 as smart mobile devices have become prevalent. On-going 6 efforts have been made to improve the user experience of mo- Inthisdemonstration,weshowcaseanInfluentialNeighbor 1 bile LBS through improvements in moving query processing Set based system named INSQ. The influential neighbor set : v efficiency. In this paper we revisit a major type of moving (INS) is an algorithm to avoid the high cost of MkNN query i query, the moving k nearest neighbor (MkNN) query [1], processing based on our previous paper [3], which uses safe X [4]. Given a moving query object and a set of static data guarding objects instead of safe regions. The intuition is that, ar objects, the MkNN query reports the k nearest neighbors of since the query object moves continuously, when the kNN thequeryobjectcontinuouslywhileitismoving.Forexample, set changes, some data object near the current kNN objects aMkNNquerycanbeusedtoreportthe3nearestgasstations must become one of the new kNNs. We call this type of data continuously while one drives on a highway, or the 5 nearest objects (i.e., data objects near the current kNN objects) the pointsofinterest(POI)continuouslywhileatouristiswalking safe guarding objects. As long as no safe guarding object around a city. becomes a kNN, the current kNN set stays valid and does not need recomputation. Conceptually, the safe guarding objects Studies on the MkNN query have mainly used the safe define a safe region as large as the order-k Voronoi cell. region based approach [5], [6]. The idea is based on that Thus, they guarantee minimum kNN set recomputation and objects at nearby positions often share the same kNN set. hence minimum communication between the query client and There may be a region where all points have the same kNN the query processor, which is critical in LBS. To improve set. Inside such a region, an object is “safe” to move freely efficiency we identify a special set of safe guarding objects without causing its kNN set to change. Thus, such a region is and call it the influential neighbor set (INS), which can be calledasaferegion.Usingthesaferegionsignificantlyreduces computed and validated in time linear to k. This set reduces the MkNN query processing cost because the kNN set will both construction and validation overhead at the same time. only need to be recomputed when the query object moves out the safe region. However, the safe region also brings in some We build an interactive demonstration program that shows Copyright(cid:13)c 2016IEEE the internal mechanism of the INS algorithm. It simulates a moving query object and displays the changing kNNs as well cells denote the corresponding objects, e.g., (3,4,7) is as- as the safe guarding objects continuously. The users are given sociated to V3(p ,p ,p ). The union of the triples exclud- O 3 4 7 options to customize several parameters for information and ing O(cid:48) is 3,5,10,12. Therefore, the MIS of O(cid:48), MIS(O(cid:48))= aestheticpurposessuchasthenumberofnearestneighborsdis- {p ,p ,p ,p }. 3 5 10 12 played.Notethatinadditiontothetwo-dimensionalEuclidean In [3] we prove that MIS is minimal. However, similar spacescenariowhichwasdiscussedinourpreviouspaper[3], to computing the strict safe region (i.e., an order-k Voronoi wealsodiscussanddemonstratehowtheINSalgorithmworks cell), computing the MIS is an expensive operation. To reduce in road networks in this paper. the computational cost, we compute an influential set that is slightly larger than the MIS but can be computed much more II. INFLUENTIALNEIGHBORSET efficiently. By definition of the influential set, this will not The key idea of our MkNN query processing approach is affect the strictness in validating the kNN sets. to use a set of safe guarding objects. As long as the query The influence set we use is the influential neighbor set object is closer to the current kNNs than the safe guarding (INS), which is defined based on the Voronoi neighbors. objects, it is guaranteed that the current kNN set is still valid, and therefore, we do not need to recompute the kNN set. Definition 3: (Voronoi neighbor set) Given a Voronoi di- agram on data object set O, we call an object p a Voronoi We call the set of safe guarding objects an influential j neighbor of another object p if the two Voronoi cells of the set (IS), in the sense that they are influential in determining i two objects share an edge, i.e., V(p)∩V(p )(cid:54)=0/. We call whether a kNN set is valid. The influential neighbor set (INS) i j O(cid:48)⊂O that contains all Voronoi neighbors of p the Voronoi is a special IS that can be efficiently computed, which will be i neighbor set of p, denoted by N (p). detailed below. Now we first formalize the IS. i O i Foranorder-1Voronoidiagram,theVoronoineighborsets Definition 1: (Influential Set, IS) Given a set of data ob- can be precomputed and stored with little overhead [7]. The jects O, a query object q and a kNN set O(cid:48) ⊂O, we call a influential neighbor set is the union of the order-1 Voronoi set S⊆O\O(cid:48) an influential set of O(cid:48), denoted by S(cid:9)O(cid:48), if neighbor sets of all current kNNs. O(cid:48) stays as the kNN set of q as long as the objects in O(cid:48) are closer to q than the objects in S are, i.e., Definition 4: (Influential neighbor set, INS) Given a kNN set O(cid:48), an object p is an influential neighbor of O(cid:48) if p is not O(cid:48)=NN (q) ⇐⇒ O(cid:48)≺ S. (1) k q in O(cid:48) while it is a Voronoi neighbor of an object p(cid:48) in O(cid:48), i.e., Here, NN (q) is a function that returns the kNN set of q, for p∈/O(cid:48),∃p(cid:48)∈O(cid:48):V(p)∩V(p(cid:48))(cid:54)=0/. We call the set of all k and A≺ B denotes that any object in a set A is closer to q influential neighbors of O(cid:48) the influential neighbor set (INS) q than every object in a set B. of O(cid:48), denoted by I(O(cid:48)), p9 I(O(cid:48))=( (cid:91) NO(p(cid:48)))\O(cid:48). (3) p3 (3,6,7) p6 p(cid:48)∈O(cid:48) (6,7,12) p12 In [3] we prove that an INS is an IS, which guarantees the (3,4,7) p7 correctness of using the INS for query processing. (4,6,7) (6,7,10) p1 p11 p4 (4,7,10) III. QUERYPROCESSING p2 (4,5,7) p10 WhenaMkNNqueryisissued,wecomputeaninitial(cid:98)ρk(cid:99) p5 p8 p9 NaNsyssteetm, dpeanroatmedetbeyrtRo,baanladntcheethINeSquoefryR,reIs(uRl)t.cHomerme,uρni≥cat1ioins and recomputation costs. We call it the prefetch ratio and Fig.1. Theminimalinfluentialset(MIS)ofO(cid:48)={p4,p6,p7} have discussed how to choose its value in [3]. To improve Trivially, O\O(cid:48) is an IS of O(cid:48), since by definition the kNN the efficiency of computing I(R), we precompute the Voronoi set O(cid:48) is closer to q than O\O(cid:48). However, using this set to diagram of O and index it with an VoR-tree [7]. validate the kNN set is expensive. We aim to find the minimal We return the set R and I(R), where the top k objects of influential set, which is the smallest influential set that can R are marked as the kNN set (NN (q)) and I(R)∪R\NN (q) k k guarantee the validity of the current kNN set. are marked as the IS. Then query maintenance starts. At each Definition 2: (Minimal Influential Set, MIS) Given a kNN timestamp, we check whether the current kNN set is nearer to set O(cid:48), the minimal influential set (MIS) of O(cid:48) is qthantheIS:(i)ifitis,thenthecurrentkNNsetisstillvalid; (ii) if it is not, we perform the kNN set update procedure. If MIS(O(cid:48))=( (cid:91) O(cid:48)(cid:48))\O(cid:48). (2) there are data object updates, we also update the kNN set and Vk(O(cid:48))∩Vk(O(cid:48)(cid:48))(cid:54)=0/ the IS according to the data object updates. Here, Vk(O(cid:48)(cid:48))∩Vk(O(cid:48)) (cid:54)= 0/ means that the two order- A. kNN Set Validation k Voronoi cells are neighboring cells, i.e., they share an edge. Figure 1 gives an example, where O(cid:48)={p ,p ,p } and To validate the current kNN set, we scan the kNN set and 4 6 7 the cross-lined region denotes V3(O(cid:48)). There are five neigh- theIS.InthekNNsetwefindtheonethatisthefarthestfrom O boring order-3 Voronoi cells, as denoted by the horizontal- q, denoted by r.delete; in the IS we find the one that is the lined regions. The triples associated with the neighboring nearest to q, denoted by r.candidate. If r.candidate is closer toqthanr.delete,thecurrentkNNsethasbecomeinvalidand by b in the figure. This mid-point satisfies d(b,p )=d(b,p ); 7 8 we need to update it. nootherobjectinO\O \I(O )isnearertobthan p or p . knn knn 7 8 Here, d() denotes the shortest road network distance between B. kNN Set Update two objects. The existence of this mid-point indicates that p and p are also order-1 Voronoi neighbors, which means 7 8 TherearetwocasesforkNNsetupdate:(i)Ifqhasentered that p should also belong to I(O ). Thus, we have shown 8 knn aneighboringorder-kVoronoicell,thenewkNNsetwilldiffer that every object in MIS(O ) also belongs to I(O ), i.e., knn knn from the current one by only one data object. In this case, we MIS(O )⊆I(O ). knn knn do not need to recompute the entire kNN set but can use the existing kNN set to compose the new kNN set. (ii) If q is not TheabovetheoremallowstovalidatethekNNsetusingthe in a neighboring order-k Voronoi cell, we first check whether INS in road networks. However, in road networks, checking thenewkNNsetisstillinR.Ifyes,thenwejustneedtoreturn whether the query object is nearer to any object in the INS this new kNN set. If not, we compute the new sets of R and than the objects in the kNN set is not a trivial operation. This I(R) and return the new kNN set and IS. is because it requires network distance (shortest path) compu- tation. To constraint the search space of this computation, we have proven the following theorem. IV. INSINROADNETWORKS We extend the influential neighbor set to road networks Theorem 2: Given a Voronoi diagram DO on a road net- in this demonstration paper. We consider a planar undirected work with a set of data objects O, a set Oknn⊂O and its INS (connected) graph G=(cid:104)V,E(cid:105) formed by a set of verticesV = I(Oknn), and a Voronoi diagram DOknn∪I(Oknn) formed by the {v ,v ,...,v }andasetofedgesE={e ,...,e}.Weassume edges and vertices from the Voronoi cells of the objects in 1 2 m 1 l that the data objects O={p1,...,pn} are all at the vertices Oknn and I(Oknn), if the kNN set of a query object q is Oknn (otherwise we can add them to the set of vertices and add on DOknn∪I(Oknn), then the kNN set of q on DO is also Oknn. edges accordingly). This theorem guarantees that we just need to consider the (smaller) road network formed by the current kNN set v1 p4 v7 p8 and the INS for query validation. We omit the detailed proof (4,7) p7 v12(7,8) due to space limit. Based on the two theorems above, we p1 v3 v5 v8b0v11b v13 hisavsheodwecvaesleodpeidn tthhee fIoNllSowailnggorsitehcmtiofno.r road networks which v9 (6,7) v14 p3 v6 p6 V. DEMONSTRATION (6,9) (5,6) The system1 is implemented as a Scala Swing application. The application runs in two modes, Road Network mode and v2 p2 v4 p5 v10 p9 2D Plane (Euclidean space) mode. Its user interface consists of two panels (cf. Figure 3): Fig.2. Aroadnetworkexample (i)Controlpanel.Thisistheleftpaneloftheuserinterface. The concept of Voronoi diagram still applies in road It is used for setting the demonstration parameters, which networks. As such, we can show that the influential neighbor include the Global Setting, the Road Network setting and the setdefinedbyDefinition4isstillasupersetoftheMISdefined 2D Plane setting. by Definition 2 in road networks. The global setting includes the underlying map to be used, Theorem 1: In a road network, given a kNN set Oknn, thedataspacetobeused,andthevalueofthequeryparameter I(Oknn) as defined by Definition 4, and MIS(Oknn) as defined k. There are also two buttons “Save” and “Read” which are by Definition 2: for recording and loading the demonstration settings. MIS(Oknn)⊆I(Oknn). (4) The road network setting includes options to control the operations being performed: “Node”, “Site”, “Trajectory”, and The detailed proof is omitted due to space limit. We use “Demo”.Whenoneofthefirstthreeoptionsischosen,itallows Fig. 2 to illustrate the idea. The figure shows an order-2 userstoadd/move/deletethenetworknodes/dataobjects/query Voronoi diagram in a road network. We denote the vertices trajectory. When “Demo” is chosen, a MkNN query will be with data objects by solid dots and the rest of the vertices simulated. The setting also includes options to control which by hollow dots, respectively. We denote the edge segments objectstobeshownaswellasthequeryobject’smovingspeed that belong to the same order-2 Voronoi cell by the same in the simulation. line type. There is a number pair, e.g., (6,7), besides these edge segments, meaning that these edge segments belong to The 2D plane setting includes the number of data objects an order-2 Voronoi cell, e.g.,V2(p ,p ). togenerate(n),thevalueoftheprefetchratio(ρ),andwhether 6 7 to show the Voronoi cells and the corresponding safe regions. Let the current kNN set be O ={p ,p }. The MIS by knn 6 7 definition should be MIS(Oknn) = {p4,p5,p8,p9}. For each Under these settings, the current kNN set and the INS are pair of objects (p,p(cid:48))∈Oknn×MIS(Oknn), e.g., (p7,p8), we displayed. can find a “mid-point” on the shortest path between p and 7 p8. For example, the mid-point between p7 and p8 is denoted 1Sourcecodeavailableathttps://github.com/chiewen/CkNN.git (a) ThekNNsetisvalid Fig.3. Screenshots(RoadNetwork,k=5) (b) ThekNNsetisinvalid (ii)Mainpanel.Thisistherightpaneloftheuserinterface. Fig.4. Screenshots(2Dplane,k=5andρ=1.6) Itdisplaysamapofaroadnetworkwheredataobjects(orange dots) and the query object (red dot) can be placed onto. Users based on a highly efficient algorithm INS. The program can specify a trajectory for the query object. In the Road simulates the movement of the query object along the trajec- Network mode, this trajectory must confine the underlying tory and displays the changing kNNs and the corresponding road network; in the 2D Plane mode, the trajectory can have influential neighbors which are the key feature of the INS anyshape.Whenquerysimulationstarts,thequeryobjectwill algorithm. Users can customize the behavior of the simulation movealongthespecifiedtrajectory.TheINSalgorithmwillrun and observe the mechanisms of the algorithm in detail. The to compute the kNN set and the INS set, which are shown as source code of the INS algorithm and the INSQ system is the green dots and the yellow dots, respectively. publicly accessible through the Internet, which can be easily adapted to real systems such as smart phone applications. IntheRoadNetworkmode,theVoronoicellsoftheobjects inthekNNsetandtheINSarerepresentedbythesetsofgreen and yellow network edges, respectively. ACKNOWLEDGEMENTS This work is supported by the National Natural Science In the 2D Plane mode (Fig. 4), the data objects are Foundation of China under Grant No. 61300021, 61472071, surroundedbytheirrespectiveorder-1Voronoicells.Theones 61472072, 61402155, the National Basic Research Program enclosed by green squares represent the objects in R. We use of China (973 Program) under Grant No. 2012CB316201 thecyanpolygontorepresentthecurrentorder-k Voronoicell, and 2014CB360509, the Fundamental Research Funds for the which will turn red when it becomes invalid. We circle two Central Universities of China No. N140404008. Jianzhong Qi special objects, the farthest object to q in the kNN set and the is the recipient of a Melbourne School of Engineering Early nearest object to q in the INS, with a green circle and a red CareerResearcherGrant(projectreferencenumber4180-E55), circle passing through them, respectively, where the center of and a University of Melbourne Early Career Researcher Grant bothcirclesareatq.Thesetwoobjectsarespecialinthesense (project number 603049). Rui Zhang is supported by ARC that the green circle should always enclose all the objects in Future Fellow project FT120100832. the kNN set, while the red circle should never enclose any object in the INS, as long as the current kNN set is still valid. REFERENCES Demonstrated Scenarios: We take the 2D Plane mode as [1] T. Hashem, L. Kulik, and R. Zhang. Countering overlapping rectangle an example. Fig. 4 displays two screenshots of the demonstra- privacyattackformovingknnqueries. InformationSystems,38(3):430– tion program, where each shows: (i) the query object stays in 453,2013. the order-k Voronoi cell of the current kNN set (i.e., the kNN [2] L.KulikandE.Tanin.Incrementalrankupdatesformovingquerypoints. set is valid), and (ii) the query object has moved out of the InGeographicInformationScience,pages251–268.2006. order-k Voronoi cell (i.e., the kNN set is invalid). [3] C.Li,Y.Gu,J.Qi,G.Yu,R.Zhang,andW.Yi.Processingmovingknn queriesusinginfluentialneighborsets. PVLDB,8(2):113–124,2014. When the query object moves out of the order-k Voronoi [4] T. Nguyen, Z. He, R. Zhang, and P. Ward. Boosting moving object cell shown in Fig. 4 (a) and moves into the position shown in indexingthroughvelocitypartitioning. PVLDB,5(9):860–871,2012. Fig.4(b),thecurrentkNNsetbecomesinvalidsincethegreen [5] S.Nutanong,R.Zhang,E.Tanin,andL.Kulik.Thev*-diagram:aquery- circleisnowenclosedbytheredcircle(i.e.,thefarthestobject dependent approach to moving knn queries. PVLDB, 1(1):1095–1106, to q in the kNN set is farther to q than the nearest object to q 2008. in the INS). [6] A.Okabe,B.Boots,K.Sugihara,andS.N.Chiu. Spatialtessellations: conceptsandapplicationsofVoronoidiagrams,volume501.Wiley.com, VI. CONCLUSION 2009. [7] M.SharifzadehandC.Shahabi. Vor-tree:R-treeswithvoronoidiagrams We built and showcased a system named INSQ to process forefficientprocessingofspatialnearestneighborqueries. PVLDB,3(1- the MkNN query on both Euclidean space and road networks, 2):1231–1242,2010.

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.