14 Pages·2008·0.86 MB·English

Distributed Image Search in Sensor Networks

Tingxin Yan, Deepak Ganesan, and R. Manmatha
{yan,dganesan,manmatha}@cs.umass.edu
Department of Computer Science
University of Massachusetts, Amherst, MA 01003

Abstract

Recent advances in sensor networks permit the use of a large number of relatively inexpensive distributed computational nodes with camera sensors linked in a network and possibly linked to one or more central servers. We argue that the full potential of such a distributed system can be realized if it is designed as a distributed search engine where images from different sensors can be captured, stored, searched and queried. However, unlike traditional image search engines that are focused on resource-rich situations, the resource limitations of camera sensor networks in terms of energy, bandwidth, computational power, and memory capacity present significant challenges. In this paper, we describe the design and implementation of a distributed search system over a camera sensor network where each node is a search engine that senses, stores and searches information. Our work involves innovation at many levels including local storage, local search, and distributed search, all of which are designed to be efficient under the resource constraints of sensor networks. We present an implementation of the search engine on a network of iMote2 sensor nodes equipped with low-power cameras and extended flash storage. We evaluate our system for a dataset comprising book images, and demonstrate more than two orders of magnitude reduction in the amount of data communicated and up to 5x reduction in overall energy consumption over alternate techniques.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

1 Introduction

Wireless camera sensor networks (networks comprising low-power camera sensors) have received considerable attention over recent years, as a result of rapid advances in camera sensor technologies, embedded platforms, and low-power wireless radios. The ability to easily deploy cheap, battery-powered cameras is valuable for a variety of applications including habitat monitoring [22], surveillance [8], security systems [10], monitoring old age homes [2], etc. In addition to camera sensor networks, the availability of cameras on almost all cellphones available today presents tremendous opportunities for “urban image sensing”. Mobile phone-centric applications include microblogging [7], telemedicine (e.g. diet documentation [26]), and others [13].

While camera sensor networks present many exciting application opportunities, their design is challenging due to the size of images captured by a camera. In contrast to sensor network data management systems for low data-rate sensors such as temperature that can continually stream data from sensors to a central data gathering site, transmitting all but a small number of images is impractical due to energy constraints. For example, transmitting a single VGA-resolution image over a low-power CC2420 wireless radio takes up to a few minutes, thereby incurring significant energy cost. An alternate approach to designing camera sensor networks is to use image recognition techniques to identify specific entities of interest so that only images matching these entities are transmitted. Much of the work on camera sensor networks has employed such an approach (e.g. [18]).

Instead of deploying such application-specific camera sensor networks, we argue that there is a need for a general-purpose image search paradigm that can be used to recognize a variety of objects, including new types of objects that might be detected. This can enable more flexible use of a camera sensor network across a wider range of users and applications. For example, in a habitat monitoring camera sensor network, many different end-users may be able to use a single deployment of camera sensors for their diverse goals including monitoring different types of birds or animals.

Two recent technology trends make a compelling case for a search-based camera sensor network. First, recent advances in image representation and retrieval make it possible to efficiently compute and store compact image representations, and to efficiently search through them using techniques modified from text retrieval [3]. These methods use descriptive features like SIFT [17] to characterize images, which are then mapped to visual words or visterms using efficient clustering techniques [6, 30]. The database images may then be represented using visterms, and searched by indexing the visterms using an inverted index [24, 25]. Second, recent studies have shown that NAND flash storage is two to three orders of magnitude cheaper than communication over low-power CC2420 radios commonly used on sensor nodes [21]. In addition, prices for flash memory storage have plummeted and capacities have soared. Thus, it is now possible to equip sensors with several Gigabytes of storage to store images locally, rather than transmit them over a wireless network.

The ability to perform “search” over a sensor network also provides a natural and rich paradigm for querying sensor networks. Although there has been considerable work on energy-efficient query processing strategies, their focus has been on SQL-style equality, range, or predicate-based queries (e.g. range queries [5], min and max [29]; count and avg [19], median, and top-k [28]; and others [9]). The closest work to ours is on text search in a distributed sensor network [31, 32]. However, their work is specific to text search and assumes that users can annotate their data to generate searchable metadata. In contrast to these approaches, we seek richer and automated methods for searching through complex data types such as images, and to enable post-facto analysis of archived sensing data in such sensor networks.

While distributed search in camera sensor networks opens up numerous new opportunities, it also presents many challenges. First, the resource constraints of sensor platforms necessitate efficient approaches to image search in contrast to traditional resource-intensive techniques. Second, designing an energy-efficient image search system at each sensor necessitates optimizing local computation and local storage on flash. Finally, distributed search across such a network of camera sensors requires the ranking algorithm to be consistent across multiple sensors when merging query results. In addition, there is a need for techniques to minimize the amount of communication incurred to respond to a user query.

In this paper, we describe a novel general distributed image search architecture comprising a wireless camera sensor network where each node is a local search engine that senses, stores and searches images. The system is designed to efficiently (both in terms of computation and communication) merge scores from different local search engines to produce a unified global ranked list. Our search engine is made possible by the use of compact image representations called visterms for efficient communication and search, and the re-design of fundamental data structures for efficient flash-based storage and search. Our system is implemented on a network of iMote2 sensor nodes equipped with the Enalab cameras [1] and custom built SD card boards. Our work has the following key contributions:

• Efficient Local Storage and Search: The core of our system is an image search engine at each sensor node that can efficiently search and rank matching images, and efficiently store images and indexes of them on local flash memory. Our key contribution in this work is the re-design of two fundamental data structures in an image search engine, the vocabulary tree and the inverted index, to make them efficient for flash-based storage on resource-constrained sensor platforms. We show that our techniques improve the energy consumption and response time of performing local search by 5-6x over alternate techniques.

• Distributed Image Search: Our second contribution is a novel distributed image search engine that unifies the local search capabilities at individual nodes into a networked search engine. Our system enables seamless merging of scores from different local search engines across different sensors to generate a unified ranked list in response to queries. In addition, we show that such a distributed search engine enables a user to query a sensor network in an energy-efficient manner using an iterative procedure involving the communication of local scores, representations and images to reduce energy consumption. Our results show that such a distributed image search is up to 5x more efficient than alternate approaches, while incurring a reasonable increase in latency (less than four seconds in a four-hop network).

• Application Case Studies: We evaluate and demonstrate the performance of our distributed image search engine in the context of an indoor book monitoring application. We show that our system achieves up to 90% accuracy for search. We also show how system parameters can be tuned to trade off query accuracy for energy efficiency and response time.

2 Background

Before presenting the design of our system, we first provide a concise background on state-of-the-art image search techniques, and identify the major challenges in the design of an embedded image search engine for sensor networks. Image search involves three major steps: (a) extraction of distinguishing features from images, (b) clustering features to generate compact descriptors called visterms, and (c) ranking matching results for a query, based on a weighted similarity measure called tf-idf ranking [3].

2.1 Image Search Overview

Image Features: A necessary pre-requisite for performing image search is the availability of distinguishing image features. While such distinguishing image features are not available for all kinds of image types and recognition tasks, several promising techniques have emerged in recent years. In particular, when one is interested in searching for images of the same object or scene, a good representation is obtained using the Scale-Invariant Feature Transform algorithm (SIFT [17]), which generates 128 dimensional vectors by computing local orientation histograms. Such a SIFT vector is typically a good description of the local region.

Visterms or Visual Words: While search can be performed by directly comparing SIFT vectors of two images, this approach is very inefficient. SIFT vectors are continuous 128 dimensional vectors and there are several hundred SIFT vectors for a VGA image. This makes it expensive to compute a distance measure for determining similarity between images. State-of-the-art image search techniques deal with this problem by clustering image features (e.g. SIFT vectors) using an efficient clustering algorithm such as hierarchical k-means [24], and by using each cluster as a visual word or visterm, analogous to a word in text retrieval [30].

The resulting hierarchical tree structure is referred to as the vocabulary tree for the images, where the leaf clusters form the “vocabulary” used to represent an image [24]. The vocabulary tree contains both the hierarchical decomposition and the vectors specifying the center of each cluster. Since the number of bits needed to represent the vocabulary is far smaller than the number of bits needed to represent the SIFT vector, this representation is very efficient. We replace each 128 byte SIFT vector with a 4 byte visterm.

Matching: Image matching is done by comparing visterms between two images. If two images have a large number of visterms in common they are likely to be similar. This comparison can be done more efficiently by using a data structure called the inverted index or inverted file [3], which provides a mapping between a visterm and all images that contain the visterm. Once the images to compare are looked up using the inverted index, a query image and a database image can be matched using visterms by scoring them. As is common in text retrieval, scoring is done by weighting more common visterms less than rare visterms [3, 24, 25]. The rationale is that if a visterm occurs in a large number of images, it is poor at discriminating between them. The weight for each visterm is obtained by computing the tf-idf score (term frequency-inverse document frequency) as follows:

$$
\begin{aligned}
tf_v &= \text{frequency of visterm } v \text{ in an image} \\
df_v &= \text{number of images in which visterm } v \text{ occurs} \\
idf_v &= \frac{\text{total number of images}}{df_v} \\
score &= \sum_i tf_i \cdot idf_i \qquad (1)
\end{aligned}
$$

where the index $i$ is over visterms common to the query and database image and $df_v$ denotes the document frequency of $v$. Once the matching score is computed for all images that have a visterm in common with the query image, the set of images can be ranked according to this score and presented to the user.

2.2 Problem Statement

There are many challenges in optimizing such an image search system for the energy, computation, and memory constraints of sensor networks. We focus on three key challenges in this paper:

• Flash-based Vocabulary Tree: The vocabulary tree data structure is typically very large in size (many tens of MB) and needs to be maintained on flash. While the data structure is static and is not modified once constructed, it is heavily accessed for lookups since every conversion of an image feature to a visterm requires a lookup. Thus, our first challenge is: How can we design a lookup-optimized flash-based vocabulary tree index structure for sensor platforms?

• Flash-based Inverted Index: A second important data structure for search is the inverted index. As the number of images captured by a sensor grows, the inverted index will grow to be large and needs to be maintained on flash. Unlike the vocabulary tree, the inverted file is heavily updated since an insertion operation occurs for every visterm in every image. In contrast, the number of lookups on the inverted index depends on the query frequency, and can be expected to be less frequent. Thus, our second challenge is: How can we design an update-optimized flash-based inverted index for sensor platforms?

• Distributed Search: Existing image search engines are designed under the assumption that all data is available centrally. In a sensor network, each node has a unique and different local image database; therefore, we need to address questions about merging results from multiple nodes. In addition, sensor network users can pose different types of queries (continuous and ad-hoc), which need to be efficiently processed. Thus, our third challenge is: How can we perform efficient and accurate distributed search across diverse types of user queries and diverse local sensor databases?

The following sections discuss our overall architecture, followed by the techniques employed by our system to address the specific problems that we outlined in this section.

3 Image Search Architecture

We describe the architecture of an image search engine for sensor networks in the context of a tiered framework. There are three tiers in our system: (i) applications and end-users that pose search queries on the sensor network, (ii) sensor proxies that act as the gateway between the Internet and the sensor field, and monitor and manage the sensor network, and (iii) a wireless network of remote camera sensors that store data locally, and have the capability to perform image search in response to a query. Image search over a sensor network involves the following steps:

Initialization: The sensor search engines are bootstrapped during deployment time by pre-loading vocabulary trees that have been constructed offline using a training dataset. The vocabulary trees at different sensors may be different for recognizing different entities. Our system also supports dynamic updates of the vocabulary tree to reflect changes in application search needs, or updates in the vocabulary tree based on more training data.

User Querying: Users of a sensor network can pose archival queries or continuous queries over the sensor network. An example of an archival query in a habitat monitoring application would be: “Retrieve the top 5 matches of a Hawk in the last three months”. If the user is looking for a new individual of the species or a different type of bird, they can also provide an image of the type of bird that they are interested in searching for. The proxy processes the query image to extract its visterms, and transmits the query as well as the image visterms to each sensor in the network. These visterms are considerably smaller than the original query image, hence they are much more efficient to communicate. Our system also supports continuous queries using the same framework. An example of a continuous query in a habitat monitoring application would be: “Transmit images of any bird that matches American Robin”. In this case, the sensors are continually monitoring new images to see if a new image matches a query.

Local search: The local image search engine handles continuous and archival queries in different ways. In the case of an archival query, the visterms of the image of interest to the query are matched against the visterms of all the images that are stored in the local image store. The result of the local search is a top-k ranked list of best matches, which is transmitted to the proxy. In the case of a continuous query, each captured image is matched against the specific images of interest to the query, and if the match is greater than a pre-defined threshold, the captured image is considered a match and transmitted to the proxy. In both cases, simple filtering operations such as motion detection [15] or color-based recognition [33] may be performed prior to performing the search in order to save processing and storage resources.

Figure 1. Search Engine Architecture

Global search: Once the local search engine has searched for images matching the search query, the proxy and the sensor interact to enable global search across a distributed sensor network. This interaction is shown in Figure 1. Global search involves combining results from multiple local search engines at different sensors to ensure that communication overhead is minimized across the network. For instance, it is wasteful if each sensor transmits images corresponding to its local top-k matches since the local top-k matches may not be the same as the global top-k matches to a query. Thus, the search process proceeds in multiple rounds where each step involves transmission to the proxy, and feedback from the proxy. In the first round, the proxy gets the ranking scores corresponding to the top-k matches resulting from the local search procedure at each sensor. The proxy then merges the scores obtained from different sensors to generate a global top-k list of images for the next round, which it communicates to the appropriate sensors. The sensors then transmit thumbnails or the full images of the requested images, which are presented to the user using a GUI.

4 Buffered Vocabulary Tree

The vocabulary tree is the data structure used at each sensor to map image features (for example SIFT vectors) to visual terms or visterms. We now provide an overview of the operation of the vocabulary tree, and describe the design of a novel index structure, the Buffered Vocabulary Tree index structure, that is optimized for flash-based lookups.

4.1 Description

The vocabulary tree at each sensor is used to map SIFT vectors extracted from captured images to their corresponding visterms that are used for search. The vocabulary tree is typically created using a hierarchical k-means clustering of all the SIFT features in the image database [24]. Hierarchical k-means assumes that the data is first clustered into a small number of $m$ clusters. Then at the next level, the points in each of the $m$ clusters are clustered into a further $m$ clusters so that this level has $m^2$ clusters. The process is repeated so that at a depth $d$ the number of clusters is $m^d$. The process stops when the desired number of leaf clusters is reached. For example, with $m = 10$ and $d = 6$ there are a million clusters at the leaf level. The clusters at the leaf level correspond to visual words or visterms. The ID of the leaf node is the ID of the visterm; for example, if a SIFT vector is matched to the second cluster in the vocabulary tree, its visterm ID is 2. The resulting vocabulary tree is shown in Figure 2: each level has a number of clusters, where each cluster is represented by the coordinate of the cluster center, as well as pointers from each node to its parent, children, and siblings.

Figure 2. K-means clustering to construct vocabulary tree

Lookup of the tree proceeds in a hierarchical manner where the SIFT vector is first compared with the $m$ cluster centers at the top level and assigned to the cluster with the closest center at this level, $c_k$. Then, the SIFT vector is compared with the centers of the children of $c_k$ and assigned to the closest child $c_{kj}$. The process repeats until the SIFT vector is assigned to a leaf node or visterm.

4.2 Design Goals

The design of the vocabulary tree has two key objectives:

• Minimize tree construction cost: The process of constructing the vocabulary tree for a set of database images is extremely computationally intensive, and can consume many hours on a resource-limited sensor node. Thus, while conventional search engines create a vocabulary tree from all the images that need to be searched, such an approach is next to impossible on embedded sensor platforms. Our first goal is therefore to minimize the cost of tree construction.

• Minimize Flash reads: The vocabulary tree is a large data structure (few MB), hence, the data structure needs to be maintained on flash. However, reading the entire data structure from flash when a new SIFT vector needs to be looked up incurs considerable latency, which in turn consumes considerable energy both for keeping the processor switched on, and for reading from flash. There are two components of a flash read cost: (a) every flash read operation incurs a fixed cost in a manner similar to disks, and (b) there is a per-byte cost associated with bytes read from flash. Hence, our second goal is to minimize the number of reads from flash for every lookup of the vocabulary tree.

4.3 Proxy-based Tree Construction

Unlike conventional search engines where the vocabulary tree is created from all the images that need to be searched, our approach separates the images used for tree construction from those used for search. The proxy constructs the vocabulary tree from images similar (but not necessarily identical) to those we expect to capture within the sensor network. For example, in a book search application, a training set can comprise a variety of images of book covers for generating the vocabulary tree at the proxy. The images can even be captured using a different camera with different resolution from the deployed nodes since the techniques to construct the tree are robust to changes in the camera [17]. Once constructed, the vocabulary tree can either be pre-loaded onto each sensor node prior to deployment (this can be done by physically plugging in the flash drive to the proxy and then copying it), or can be dynamically transmitted to the network during system operation.

One important consideration in constructing the vocabulary tree for sensor platforms is ensuring that its size is minimal. Previous work in the literature has shown that using larger vocabulary trees produces better results [24, 25]. This work is based on using trees with a million leaves or more, which are many Gigabytes in size. Such a large vocabulary tree presents three problems in a sensor network context: (a) it is large and consumes a significant fraction of the local flash storage capacity, (b) it incurs greater access time to read, thereby greatly increasing the latency and energy consumption for search, and (c) it would be far too large to dynamically communicate to update sensor nodes. To reduce the size of the vocabulary tree, our design relies on the fact that typical sensor networks have a small number of entities of interest (e.g. a few books, birds, or animals), hence, the vocabulary tree can be smaller and targeted towards searching for these entities. For example, in a book cover monitoring application, a few hundred books can be used as the training set to construct the vocabulary tree, thereby reducing the number of visterms and consequently the size of the tree.

4.4 Buffered Lookup

The key challenge in mapping a SIFT vector to a visterm on a sensor node is minimizing the overhead of reading the vocabulary tree from flash. A naive approach that reads the vocabulary tree from flash as and when required is extremely inefficient since it needs to read a large chunk of the vocabulary tree for practically every single lookup.

Figure 3. Vocabulary Tree

The main idea in our approach is to reduce the overhead of reads by performing batched reads of the vocabulary tree. First, since the vocabulary tree is too large to be read into memory as a whole, it is split into smaller sub-trees as shown in Figure 3, and each subtree is stored as a separate file on flash. The subtree size is chosen such that it fits within the available memory in the system. Therefore, the entire subtree file can be read into memory. Second, we batch the SIFT vectors from a sequence of captured images into a large in-memory SIFT buffer. Once the SIFT buffer is full, the entire buffer is looked up using the vocabulary tree one level at a time. Thus, the root subtree (subtree 1 in Figure 3) is first read from flash, and the entire batch of SIFT vectors is looked up on the root subtree. This lookup determines which next-level subtree needs to be read for each SIFT vector, and results in a set of smaller second-level buffers. The next-level subtrees are looked up one by one, and the batch processed on each of these sub-trees to generate third-level buffers, and so on. The process proceeds in a level by level manner, with a subtree file being read, and a buffer of SIFT vectors being looked up at each level. Such a buffered lookup on a segmented vocabulary tree ensures that the cost of reading an entire vocabulary tree is amortized over a batch of vectors, thereby reducing the amortized cost per lookup.

5 Inverted Index

While the vocabulary tree is used to determine how to map from SIFT feature vectors to visterms, an inverted index (also known as the inverted file) as in text retrieval [3] is used to map a visterm to the set of images in the local database that contain the visterm. The inverted index is used in two cases in our system. First, when an image is inserted into the local database, the visterms for the image are inserted into the inverted index; this enables search by visterms, wherein the visterms contained in the query image can be matched to the visterms contained in the stored locally captured images. Second, the inverted index is also used to determine which images to age when flash is filled up to make room for new images. Aging of images proceeds by first determining the images that are least likely to be useful for future search, and deleting them from flash.

5.1 Description

The inverted index is updated for every image that is stored in the database. Let the sequence of visterms contained in image $I_i$ be $V_i = v_1, v_2, \ldots, v_k$. Then, the entry $I_i$ is inserted into the inverted index entry for each of these visterms contained in $V_i$. Figure 4 shows an example of an inverted index that provides a mapping from the visterm to the set of images in the local database that contain the term, as well as the frequency of the appearance of the term across all images in the local database. Each entry is indexed by the Visterm ID, and contains the document frequency (DF) score of the visterm (Equation 1), and a list of Image IDs. We do not store the term frequency (TF) numbers per image in the interest of saving memory space. For efficiency reasons we compute the inverse document frequency (IDF) score at query time from the DF. As we show later, the IDF scores are sufficient for good performance in our system.

Figure 4. Inverted Index. DF denotes document frequency

The inverted index facilitates querying by visterms. Let the set of visterms in the query image be $Q = q_1, q_2, \ldots, q_n$. Each of these visterms is then used to look up the inverted index and the corresponding inverted list for the visterm is returned. Thus for query visterm $q_i$, a list of image IDs $L_i = I_{i1}, \ldots, I_{ik}$ is returned, where each element of the list is an image ID. The lists over all the query visterms are intersected and scored to obtain a rank ordering of the images with respect to the query.

5.2 Design Goals

Unlike the vocabulary tree, which is a static data structure that is never updated, the inverted index is updated for every visterm in a captured image, hence it needs to be optimized for insertions. The design of the flash-based inverted index structure is influenced by characteristics of the flash layer below and of the search engine layer above, and has the following goals:

• Minimize flash overwrites: Flash writes are immutable and one-time: once written, a data page must be erased before it can be written again. The smallest unit that can be erased on flash, termed an erase-block, typically spans a few tens of pages, which makes any in-place overwrite of a page extremely expensive since it incurs the cost of a block read, write and erase. Hence, it is important to minimize the number of overwrites to a location that has been previously written to on flash.

• Minimize fixed cost of flash writes: This design goal is shared between the designs of the vocabulary tree and inverted index. However, in the case of the inverted index, the fixed cost of writes is perhaps more important than the fixed cost of reads since queries can be expected to be less frequent than data in a sensor network. Hence, the number of writes to flash should be minimized to reduce the total fixed access costs.

• Exploit visterm frequency: The frequency of words in documents typically follows a heavy tailed behavior, referred to as a Zipf distribution, i.e. the frequency of a word is inversely proportional to its rank [3]. In other words, the most frequent word occurs about twice as frequently as the second most frequent word, three times as frequently as the third most frequent word, and so on. From a search perspective, the least frequent words are the most useful for discriminating between images, and have the highest IDF score. Our third goal is to exploit this search engine characteristic to optimize inverted index design.

5.3 Inverted Index Design

We discuss three aspects of the design of the inverted index in this section: (a) how the index is stored on flash, (b) how it is updated when a new image is captured, and (c) how entries are deleted when images are removed from flash.

Storage: The inverted index is maintained on flash in a log-structured manner, i.e. instead of updating a previously written location, data is continually stored as an append-only log. Within this log, the sequence of image IDs corresponding to each visterm is stored as a reverse linked list of chunks, where each chunk has a set of image IDs and a pointer to the previous chunk. For example, in Figure 4, the set of Image IDs corresponding to visterm $v_1$ is stored in flash as two chunks, with a pointer from the second chunk to the first chunk. The main benefit of this approach is that each write operation becomes significantly cheaper since it only involves an append operation on flash, and avoids expensive out-of-place rewrites. In addition, the writes of multiple chunks can be coalesced to reduce the number of writes to flash, thereby reducing the fixed cost of accessing flash for each write. Such a reverse linked list approach to storing files is also employed in a number of other flash-based data storage systems that have been designed for sensor platforms including MicroSearch [31], ELF [4], Capsule [20], and FlashDB [23].

Insertion: While log-structured storage optimizes the write overhead of maintaining an inverted index, it increases the read overhead for each query since accessing the sequence for each visterm involves multiple reads in the file. To minimize read overhead, we exploit the Zipf-ian nature of the IDF distribution in two ways during insertion of images to the index. First, we exploit the observation that the most frequent terms have a very low contribution to the IDF score, and have least impact on search. Hence, these visual terms can be ignored and do not have to be updated for each image. In our system, we determine these non-discriminating visual terms during vocabulary tree construction time, so that they do not need to be updated during system operation.

Second, we minimize the read overhead by only flushing the “longest chains” to flash, i.e. the visterms that have the longest sequence of image IDs. Due to the Zipf-ian distribution of term frequency, a few chains are extremely long, and by writing these to flash, the number of flash read operations can be tremendously reduced. The Zipf-ian distribution also reduces the amount of state that needs to be maintained for determining the longest chains. We store a small top-k list of the longest chains as shown in Figure 4. This list is updated opportunistically: whenever a visterm is accessed that has a list longer than one of the entries in the top-k list, it is inserted into the list. When the size of the inverted index in memory exceeds the maximum threshold, the top-k list is used to determine which visterm sequences to flush to flash. Since the longer sequences are more frequently accessed, the top-k list contains the longest sequences with high probability, hence this technique writes the longest chains to flash, thereby minimizing the number of flash writes. The use of the frequency distribution of visterms to determine which entries to flush to flash is a key distinction between the inverted index that we use and the one that is proposed in [31].

Deletion: Image deletions or aging of images triggers deletion operations on the inverted index. Deletion of elements is expensive since it requires re-writing the entire visterm chain. To avoid this cost, we use a large image ID space (32 bits) such that rollover of the ID is practically impossible during the typical lifetime of a sensor node.

A key novelty of our approach is the use of the inverted index to determine what images should be aged when flash is filled up. Ideally, images that are less likely to contain objects of interest should be aged before images that are more likely to contain objects of interest. We use a combination of value-based and time-based aging of images. The “value” of an image is computed as the cumulative idf of all the visterms in the image normalized by the number of visterms in the image. Images with lower idf scores are more likely to be the background rather than objects of interest. This value measure can be combined with the time when an image was captured to determine a joint metric for aging. For example, images that are more than a week old, and have a normalized idf score less than a pre-defined threshold, can be selected for aging.

... the above book example, the user can also issue a continuous query and request to be notified whenever the book is observed in the network over the next week.

In both cases, the proxy which receives the query image converts the image into its corresponding visterms. The visterm representation of a query is considerably smaller than the original image (approx 800 bytes), hence, this makes the query considerably smaller to communicate over the sensor network. The proxy reliably transmits the query image visterms to the entire network using a reliable flooding protocol.

6.2 Local Ranking of Search Results

Once the query image visterms are received at each sensor, the sensors initiate the local search procedure. We first describe the process by which images are scored and ranked, and then describe how this technique can be used for answering ad-hoc and continuous queries.

Scoring and Ranking: The local search procedure involves two steps: (a) the inverted index is looked up to determine the set of images to search, and (b) the similarity score is computed for each of these images to determine a ranked list. Let $V_Q$ be the set of visterms in the query image. The first step involved in performing the local search is to find all images in the local database that have at least one of the visterms in $V_Q$. This set of images can be determined by looking up the inverted index for each of the entries $v$ in $V_Q$. Let $L_v$ be the list of image IDs corresponding to visterm $v$ in the inverted index. Assume that the first visterm in $V_Q$ is $v_1$ and the corresponding inverted list of images is $L_{v_1}$. We maintain a list of documents $D$, which is initialized with the list of images $L_{v_1}$. For each of the other visterms $v$ in $V_Q$, the corresponding inverted list $L_v$ is scanned and any image not in the document list $D$ is added to it.

Once the set of images is identified, matching is done by
comparingvistermsbetweeneachimageinV(i), wherei∈ D and the visterms in the query image V . If the two im- 6 Distributed Search Q ageshavealargenumberofvistermsincommmontheyare A distributed search capability is essential to be able to likelytobesimilar. Vistermsareweightedbytheiridf. Vis- obtain information that may be distributed over multiple termswhichareverycommonhavealoweridfscore,andare nodes. Inoursystem, thelocalsearchenginesatindividual weightedlessthanuncommonvisterms.Theidfiscomputed sensorsarenetworkedintoasingledistributedsearchengine asdescribedinEquation1. that is responsible for searching and ranking images in re- Toobtainthescoreofanimagei,weadduptheidfscores sponse to a query. In this section, we discuss how global forthevistermsthatmatchbetweenthequeryimage,V ,and Q rankingofimagesisperformedoverasensornetwork. thedatabaseimage,V toobtainthetotalscore.Thisisdiffer- i 6.1 SearchInitiation entthanthemorecommonmethodmentionedinEquation1. A user can initiate a search query by connecting to the proxy,andtransmittingasearchrequest.Twotypesofsearch Score(V,V )= ∑ idf (2) queriesaresupportedbyoursystem: ad-hocsearchandcon- i Q i tinuous search. An ad-hoc search query (also known as a i∈VQandi∈Vi snapshot query) is a one-time query, where the user pro- Any image with a score greater than a fixed pre-defined videsanimageofinterestand, perhapsatimeperiodofin- threshold(Score(V,V )>Th)isconsideredamatch,andis i Q terest, and initiates a search for all images captured during added to the list of query results. The final step is sorting thespecifiedtimeperiodthatmatchtheimage. Forexample, these query results by score to generate a ranked list of re- inthecaseofabookmonitoringsensornetwork,auserwho sults,wherethehigherrankedimageshavegreatersimilarity is missing his or her copy of “TCP Illustrated” may issue tothequeryimage. 
an ad-hoc query together with a cover of the missing book, Ad-hocvsContinuousQueries:Thelocalsearchproce- andrequestallimagesmatchingthebookdetectedoverthe dure for ad-hoc and continuous queries use the above scor- past few days. A continuous search query is one where the ingandrankingmethodbutdifferinthedatabaseimagesthat networkiscontinuallyprocessingcapturedimagestodecide they consider for the search. An ad-hoc query is processed whetheritmatchesaspecifictypeofentity. Forinstance,in over the set of images that were captured within the time periodofinteresttothequery. Toenabletime-basedquery- ing, an additional index can be maintained that maps each storedimagetoatimestampwhenitwascaptured. Forafew thousandimages,suchatimeindexisasmalltablethatcan bemaintainedinmemory. Oncethelistofimagestomatch is retrieved from the inverted index lookup, the time index islookeduptoprunethelistandonlyconsiderimagesthat were captured during the time period of interest. The rest of the scoring procedure is the same as the mechanism de- scribedabove. Inthecaseofacontinuousquery,thesearch procedure runs on each captured image as opposed to the Figure5. iMote2withEnalabcameraandcustomSDcardboard storedimages. Inthiscase,adirectimage-to-imagecompar- isonisperformedbetweenthevistermsofthecapturedimage 6.4 NetworkRe-Tasking and those of the query image. If the similarity between the visterms exceeds a pre-defined threshold, it is considered a Animportantcomponentofourdistributedinfrastructure positivematch. Ifthenumberofcontinuousqueriesishigh, istheabilitytore-tasksensorsbyloadingnewvocabularies. the cost of direct matching can be further reduced by using Thisabilitytore-taskisimportantfortworeasons. First, it aninvertedfile. enables the sensor proxy to be able to update the vocabu- larytreeattheremotesensorstoreflectimprovementsinthe structure, for instance, when new training images are used. 
6.3 GlobalRankingofSearchResults Second, thisallowsthesearchenginetouploadsmallervo- The search results produced by individual sensors are cabulary trees as and when needed on to the resource con- transmitted back to the proxy to enable global scoring of strainedsensors. Oneofthebenefitsofloadingsmallervo- search results. The key problem in global ranking of cabulary trees on-demand is that it is less expensive than search results is re-normalizing the scores across separate searchingthroughalargevocabularytreelocallyateachsen- databases. AsshowninEquation1,thelocalscoringateach sor. Third,whenitisnecessarytousethesensornetworkto sensor depends on the total number of images in the local searchfornewkindsofobjects, anewvocabularytreemay databaseateachsensor. Differentsensorscouldhavediffer- be loaded. For example, assume we had a vocabulary tree entnumberofcapturedimages,andhence,differentlysized to detect certain kinds of animals such as deer but we now local databases. Hence, the scores need to be normalized wanttore-taskittofindtigers,wecaneasilybuildanewvo- beforecomparingthem. cabularytreeattheproxyanddownloadit. Disseminationof To facilitate global ranking, each sensor node transmits thenewvocabularyintoasensornetworkcanbedoneusing the count of the number of images in which each visterm existingre-programmingtoolssuchasDeluge[11]. fromthesetVQoccurs,inadditiontothetotalnumberofim- 7 Implementation ages in the local database. In other words, it transmits the Inoursystem,eachsensornodecomprisesanimote2sen- numerator and denominator of Equation 1 separately to the sor [12], an Enalab camera [1], and a custom SD-card ex- proxy. Note that only the numbers for the visterms which tension board that we designed, as shown in Figure 5. The occur in the query need to be updated not all the visterms. 
Enalab camera module comprises an OV7649 Omnivision Atypicalqueryimagehasabout200vistermssoweneedto CMOS camera chip, which provides color VGA (640x480) onlysendontheorderof1.6KBfromeachsensornodethat resolution. The imote2 comprises a Marvell PXA271 pro- hasavalidresult. LetSbethesetofsensorsinthenetwork, cessor which runs between 13-416 MHz, and has 32MB and C be the count of the number of images captured by ij SDRAM[12];aChipconCC2420radiochip;andanEnalab sensors thatcontainthevistermv . LetN bethetotalnum- i j i camera. The camera is connected to the Quick Capture In- berofimagesatsensors. Then,theglobalidfofvistermv i terface (CIF) on imote2. To support large image data stor- iscalculatedas: age,an1GBexternalflashmemoryisattachedtoeachsensor node. ∑ N Sensor Implementation: The overall block diagram of i idf = vi∈S (3) the search engine at each sensor is shown in Figure 6. The v ∑C entiresystemisabout5000linesofCcodeexcludingthetwo iv imageprocessinglibrariesthatweusedforSIFTandvocabu- vi∈S larytreeconstruction. Thesystemhasthreemajormodules: Oncethenormalizedidfsarecalculatedforeachvisterm, (a) Image Processing Module (IPM), (b) Query Processing the scores for each image are computed in the same man- Module(QPM),and(c)CommunicationProcessingModule ner as shown in Equation 2. Finally, the globally ranked (CPM).TheIPMcapturesanimageusingthecameraperiod- scoresarepresentedtotheuser.Theusercanrequesteithera ically,readsthecapturedimageandsavesitintoaPGMim- thumbnailofanimageonthelist,orthefullimage. Retriev- agefileonflash. ItthenprocessesthePGMimagefiletoex- ing the thumbnail provides a cheaper option than retrieving tractSIFTfeatures,andbatchesthesefeaturesforabuffered thefullimage,henceitmaybeusedasanintermediatestep lookup of the vocabulary tree.The QPM module processes tovisuallypruneimagesfromthelist. ad-hoc queries and continuous queries from the proxy. For memory,someimagelistsforvistermsarewrittentothelog file. 
Each of the logged entries is a list of image IDs, and CoQntuineuryous QUERY thefileoffsettothepreviousloggedentrycorrespondingto COPMRMMOOCUDENUSICSLAEINTIGO N LOG INSVEFEAIRLRETCEHD DICLTOIOANDARY QUERY tihsemsaaimnteaivniesdteirnmm. eTmhoeryofifnseotrodferthtoe hacecaedssofthtihsisfllaisnhkecdhaliinst. Ad-hoc Query PROCMEOSDSUINLGE Whentheimagelistcorrespondingtoavistermneedstobe FCFS loadedintomemory,thelinkedlististraversed. UPDATE REGISTER ImageStorageandAging: Eachrawimageisstoredin QUERY QUEUE a separate .pgm file, its SIFT features are stored in a .key UPDATE INVFEIRLETED DICIMTAIOGNEASRY file, and its visterms are stored in a .vis file. A four-byte imageIDisusedtouniquely identifyanimage. Weexpect Capture Image SIFT TVEIRSM- INLOVEGRGTEEDD the number of image that can be captured and stored on a FILE sensor during its lifetime to be considerably less than the VOCA- 232,therefore,wedonotaddresstherolloverprobleminthis BULARY IMAGE CONQMTUAITENCRUHYOUS implementation. Aginginoursystemistriggeredwhenthe PRMOOCDEUSSLEING flashbecomesmorethan75%full. WirelessCommunicationLayer: Thewirelesscommu- nication in our system is based on the TinyOS MAC driver whichisbuilt upontheIEEE802.15.4radio protocol. This Figure6. SystemDiagram driver provides a simple open/close/read/write API for im- plementingwirelesscommunicationandiscompatiblewith theTinyOSprotocols. Sincethereisnoreadilyavailableim- anad-hocquery,itlooksuptheinvertedfileandfindthebest plementationofareliabletransportprotocolfortheiMote2, kmatchesfromthelocallystoredimages. Ifthevalueofkis we implemented areliable transport layer overthe CC2420 notspecifiedbythequery,thedefaultissettofive. Forcon- MAClayer. Theprotocolissimpleandhasafixedwindow tinuousqueries,QPMloadsthevistermsofthecorrespond- (setto2inourimplementation)andend-to-endretransmis- ingdictionaryimagesonceanewtypeofqueryisreceived. sions. 
Wenotethatthecontributionofourworkisnotare- Wheneveranewimageiscaptured,theIPMgoesthroughall liabletransferprotocol,hencewewerenotfocusedonmax- thedictionaryimagestofindbestkmatches. CPMhandles imizingtheperformanceofthiscomponent. thecommunicationtoothersensorsandtheproxy. 8 Experimental Evaluation SIFT Algorithm: The SIFT implementation we are us- ingisderivedfromSIFT++[27]. Thisimplementationuses In this section, we evaluate the efficiency of our dis- floating point for most calculations, which is exceedingly tributed image search system on a testbed consisting of 6 slow on the iMote2 (10 - 15 seconds to process a QVGA iMote2 sensor nodes and a PC as a central proxy. Each of image) since it does not have a floating point unit. Ko et these imote2 nodes is equipped with a Enalab camera and al [14] present an optimized SIFT implementation which 1GBSDcard,runsarm-Linuxandcommunicateswithother uses fixed point arithmetic and show much faster process- MotesusingtheTOSMACdriver,whichiscompatiblewith ing speed. However, their implementation is specific to the theTinyOSwirelessprotocols. Theresolutionofthecamera TIBlackfinprocessorfamily. Forlackoftime,wehavebeen issettogenerateQuarterVGA(QVGA)images. unabletoportthisimplementationtoournode. We use a dataset consisting of over 600 images for our Vocabulary Tree: Our implementation of the hierarchi- training set. Among them, over three hundred images are calk-meansalgorithmtogeneratethevocabularytreeisde- technical book covers, and the other three hundred images rivedfromthelibpmklibrary[16]. Duetothememorylimi- contain random objects. The book cover images are col- tationoniMote2,wesetthesizeofthevocabularytreetobe lected from a digital camera with VGA format. The other approximately 80K, that is, the branching factor = 16, and images are collected from the Internet. During our experi- depth=5. 
Bymodifyinglibpmk, wecanshrinkthevocab- ments, the iMote2 captures images periodically, and differ- ulary tree to 3.65MB, but it is still too large to be loaded enttechnicalbookcoverswereheldupinfrontoftheiMote2 completelytomemoryinimote2. AsdescribedinSection4, cameratoformthesetofcapturedimages. wesplittheentirevocabularytreeintoanumberofsmaller 8.1 Micro-benchmarks segments. The root segment contains the first three levels Our first set of microbenchmarks are shown in Table 1, of the tree, and for each leaf node of the root subtree, i.e., where we measure the power consumption of the PXA271 162=256inourcase,thereisasegmentcontainingthelast processor, the CC2420 radio, the SD card extension board, two levels of the tree. After splitting, each chunk is of size and the Enalab camera on our iMote2 platform. Since the 14KB,whichissmallenoughforimote2toreadintomem- data ratesof differentcomponents vary, we alsolist theen- ory. ergy consumption for some of them. The results show that InvertedIndex: Theinvertedindexisimplementedasa the processor consumes significant power on our platform, singlefileonflash,whichisusedasanappend-onlylogfile. hence it is important to optimize the amount of processing. Theinvertedindexismaintainedinmemoryaslongassuf- Another key observation is that the difference between the ficientmemoryisavailable,butonceitexceedstheavailable energy costs of communication and storage is significant. Table1. PowerandEnergybreakdown Component State Power(mW) Per-byteEnergy s) 100 ww/o/ BBaattcchhiinngg PXA271Processor Active 192.3 m( Idle 137.0 ster CC2420radio Active 214.9 46.39µJ e vi SDFlash Read 11.2 5.32nJ ut Write 40.3 7.65nJ mp 10 o OVcamera Active 40.0 o c e t m Table2. Imageprocessingbreakdown Ti Operation Time(s) Energy(J) 1 1 2 3 4 5 6 7 8 9 10 Sensing(ImageCapture) 0.7 0.14 Number of batched images Sift(floatingpt) 12.7 2.44 Figure7. Impactofbatchingonlookuptime. 
Sift(fixedpt)[14] 2.5 0.48 CompresstoJPEG(ratio0.33) 1.2 0.23 CompresstoGZIP(ratio0.61) 0.7 0.14 queries. 8.2 EvaluationofVocabularyTree ThisisbecausetheSDcardconsumesanorderofmagnitude Inthissection,weevaluatetheperformanceofourvocab- less power than the CC2420 radio, and has a significantly ularytreeimplementationandshowthebenefitsofpartition- higher data rate. The effective data rate of the CC2420 ra- ingandbatchingtooptimizelookupoverhead. dio is roughly 46.6Kbps, whereas the data rate of SD card 8.2.1 ImpactofBatchedLookup is12.5MBps. Asaresult,storingabyteisroughlyfouror- We evaluate the impact of batched lookup on a vocabu- dersofmagnitudecheaperthantransmittingabyte,thereby lary tree with 64K visterms, which has a depth of 4 and a validatingourextensiveuseoflocalstorage. branching factor of 16. The vocabulary tree is partitioned Table2benchmarkstherunningtimeofthemajorimage into257chunksincludingonerootsubtreeand256second- processingtasksinoursystem,includingsensing,SIFTcom- level subtrees. Each of these chunks is around 20KB. We putation,andImagecompression. Allofthesetasksinvolve varythebatchsizefromasingleimage,whichisabatchof theprocessorinactivemode. WefindthattheSIFTfeature roughly200SIFTfeatures,totenimages,i.e. roughly2000 extraction from a QVGA image using the SIFT++ library SIFTfeatures. [27] is prohibitively time consuming (roughly 12 seconds). Figure7demonstratesthebenefitofbatchedlookup.Note Asignificantfractionofthisoverheadisbecauseoffloating that the y-axis is in log scale and shows the time to lookup point operations performed by the library, which consumed visterms for each image. The upper line shows the refer- excessive time on a PXA271 since the processor does not encetimeconsumedforlookupwhennobatchingisused,i.e. 
haveafloatingpointunit.Thisproblemissolvedinthefixed wheneachSIFTfeatureislookedupindependently,andthe point implementation of SIFT described by Ko et al [14], lowerplotshowstheeffectofbatchingonper-imagelookup who report a run time of roughly 2-2.5 seconds (shown in timeasthenumberofbatchedimagesincreases. Ascanbe Table 2). Since the inflated SIFT processing time that we seen, even batching the SIFT features of a single image re- measureisanartifactoftheimplementation,weusethefixed duces the lookup time by an order of magnitude. Further pointSIFTrun-timenumbersfortherestofthispaper. The batching reduces the lookup time even more, and when the tablealsoshowsthethetimerequiredforlossyandlossless batchsizeincreasesfrom1to10images,thetimetolookup compressionontheiMote2,whichweperformbeforetrans- animagereducesfrom2.8secsto1.8secs. Thisshowsthat mittinganimage. batchedlookups,andtheuseofpartitionedvocabularytrees Table 3 reports the memory usage of the major com- hasconsiderableperformancebenefit. ponents in our system. The arm-linux running on imote2 8.2.2 ImpactofVocabularySize takesabouthalfofthetotalmemory,andtheimagecapture Thesizeofthevocabularytreehasasignificantimpacton andSIFTprocessingcomponentstakeaboutaquarterofthe boththeaccuracyaswellastheperformanceofsearch.From memory. Asaresultofouroptimizations,theinvertedindex anaccuracyperspective, thegreaterthenumberofvisterms andvocabularytreearehighlycompactandconsumeonlya in a vocabulary tree, the better is the ability of the search few hundred kilobytes of memory. The dictionary file cor- enginetodistinguishbetweendifferentfeatures. Fromaper- responds to the images in memory for handling continuous formanceperspective,largervocabulariesmeangreatertime andenergyconsumedforbothlookupfromflash,aswellas for dynamic reprogramming. To demonstrate this tradeoff Table3. 
Breakdownofmemoryusage between accuracy and performance, we evaluate three vo- Task Memory(MB) cabulary trees of different sizes constructed from our train- ArmLinux 14.3 ingset. Theaccuracyandlookupnumbersareaveragedover ImageCapture 1.9 twentysearchqueries. SIFT 7.6 Table 4 reports the impact of vocabulary size on three InvertedIndex 0.75 VocabularyTree 0.2 metrics: thelookuptimeper-imageforconvertingSIFTfea- DictionaryFile 0.1 ture to visterms, the search accuracy, i.e the fraction of the ProcessingModules 0.7 queriesforwhichthetop-mostmatchwasaccurate,andthe

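The local scoring and ranking procedure of Section 6.2 can be illustrated with a minimal Python sketch. It is not the authors' C implementation: the function name `local_search`, the dictionary-based inverted index, and the toy idf values are assumptions made for illustration, and each inverted list is assumed to hold an image ID at most once. The default of k = 5 follows the default stated in Section 7.

```python
from collections import defaultdict


def local_search(query_visterms, inverted_index, idf, threshold, k=5):
    """Return up to k (image_id, score) pairs, best first, following Equation 2."""
    scores = defaultdict(float)
    # Step (a): looking up L_v for each query visterm v yields the candidate images.
    # Step (b): a candidate's score is the sum of idf over the visterms it shares
    # with the query, so both steps can be folded into one pass over the lists.
    for v in set(query_visterms):
        for image_id in inverted_index.get(v, ()):
            scores[image_id] += idf.get(v, 0.0)
    # Keep only images whose score exceeds the threshold Th, then rank them.
    matches = [(img, s) for img, s in scores.items() if s > threshold]
    matches.sort(key=lambda pair: (-pair[1], pair[0]))
    return matches[:k]


# Hypothetical toy data: visterm -> list of image IDs, and visterm -> idf.
inverted_index = {1: [10, 11], 2: [10], 3: [12]}
idf = {1: 0.5, 2: 2.0, 3: 1.0}
ranked = local_search([1, 2, 3], inverted_index, idf, threshold=0.75)
print(ranked)
```

Image 10 shares visterms 1 and 2 with the query, so it scores 0.5 + 2.0 and ranks first; image 11 shares only the low-idf visterm 1 and falls below the threshold.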

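The re-normalization step of Section 6.3 can be sketched in the same way. The snippet below mirrors Equation 3 as printed (a plain ratio of summed image totals to summed per-visterm counts) and then rescores candidates as in Equation 2. The report format, the `rerank` helper, and the assumption that each sensor also reports which query visterms each candidate image matched are hypothetical; the paper does not specify the message layout.

```python
def global_idf(reports):
    """Combine per-sensor reports into global idf values (Equation 3).

    reports: list of (n_images, {visterm: image_count}) tuples, one per sensor.
    """
    total_images = sum(n for n, _ in reports)
    totals = {}
    for _, counts in reports:
        for v, c in counts.items():
            totals[v] = totals.get(v, 0) + c
    # Visterms that no sensor has seen are skipped to avoid dividing by zero.
    return {v: total_images / c for v, c in totals.items() if c > 0}


def rerank(candidates, idf):
    """Score each candidate by the summed global idf of its matched visterms."""
    scored = [(img, sum(idf.get(v, 0.0) for v in matched))
              for img, matched in candidates.items()]
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored


# Hypothetical reports from two sensors holding 100 and 300 images.
reports = [(100, {7: 10, 9: 50}), (300, {7: 30, 9: 0})]
idf = global_idf(reports)

# Hypothetical candidates: image label -> query visterms it matched.
candidates = {"s1/17": {7, 9}, "s2/4": {7}}
ranking = rerank(candidates, idf)
print(idf, ranking)
```

Because the sums are taken over all sensors before dividing, a sensor with a small local database cannot inflate the weight of a visterm that is common elsewhere in the network.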