Towards Distributed Service Discovery in Pervasive Computing Environments

Dipanjan Chakraborty, Anupam Joshi, Yelena Yesha, Tim Finin
Department of Computer Science and Electrical Engineering, University of Maryland Baltimore County, Baltimore, MD 21250
Email: dchakr1, joshi, yeyesha, fi[email protected]

Abstract

The paper proposes a novel distributed service discovery protocol for pervasive environments. The protocol is based on the concepts of peer-to-peer caching of service advertisements and group-based intelligent forwarding of service requests. It does not require a service to be registered with a registry or lookup server. Services are described using the Web Ontology Language (OWL). We exploit the semantic class/subClass hierarchy of OWL to describe service groups and use this semantic information to selectively forward service requests. OWL-based service description also enables increased flexibility in service matching. We present simulation results that show that our protocol achieves increased efficiency in discovering services (compared to traditional broadcast-based mechanisms) by efficiently utilizing bandwidth via controlled forwarding of service requests.

Keywords: Service Discovery Architecture, Pervasive Computing, MANET, OWL, Semantic Description, Peer-to-Peer, Advertisements

This work was supported in part by NSF Awards IIS9875433, IIS0209001, CCR 0070802, ACI 0203958, and the DARPA DAML program under contract F30602-97-1-0215.
Report Documentation Page (Standard Form 298, Rev. 8-98). Date: 2005. Title: Towards Distributed Service Discovery in Pervasive Computing Environments. Performing organization: Defense Advanced Research Projects Agency, 3701 North Fairfax Drive, Arlington, VA 22203-1714. Distribution/availability: approved for public release; distribution unlimited. Abstract: see report. Security classification (report, abstract, this page): unclassified. Number of pages: 35.

I. INTRODUCTION AND MOTIVATION

Service discovery is a well-recognized challenge in distributed environments [40], [9], [14], [24], [27], [31]. With the decreasing cost and form factor of computing devices, the increase in the information being kept on these devices, and the increasing prevalence of short range ad-hoc wireless networks, service discovery will play an important role in Pervasive Computing environments.
Pervasive Computing environments comprise handheld, wearable and embedded computers, besides regular desktop clients and servers. These are connected by some combination of wireless ad-hoc networks and wireless infrastructure-based networks such as WLANs. In such environments, the cohort of computing elements participating in any distributed system changes dynamically with time. In other words, a user (her computing device(s), to be precise) spontaneously networks with different devices as she and other users change locations over a period of time. This is not to say that all elements in this distributed scenario must be mobile, only that no particular set of devices/computers is available to form the stable core of a distributed system at all times. For instance, in environments such as shopping malls, conference venues or smart-offices, some devices (e.g. desktops/laptops, IP phones, point-of-sale terminals, projectors, coffee machines) are static while other devices (cell phones, handhelds, etc.) are mobile. In the extreme case, Pervasive Computing environments include MANETs (Mobile Ad hoc Networks) where all nodes are mobile and dynamically change their locations. Examples of such environments can be found in the mobile devices used by emergency response services, by soldiers on battlefields, by people walking on streets, etc.

We envisage that in the near future, static, mobile, and embedded devices will provide customized information, services and computation platforms to peers in their vicinity. The primary goal of applications for pervasive computing environments is to perform the task given by the user by exploiting the resources or services that are present in the neighborhood. Some requests need a single service that is directly available in the vicinity, whereas other requests need multiple services or information sources to be integrated to obtain the desired result. In either case, we need a flexible service discovery infrastructure that is tailored towards pervasive environments [10].
Of course, there are issues related to security and privacy in such environments. Other colleagues in our group are building distributed trust and belief based systems for security and privacy in pervasive environments [2], [28], [44]. There is also a question of payments for services offered in this environment. This is outside the scope of our present work, but is being actively researched in the m-commerce and economics domains.

There has been considerable academic and industrial research effort in service discovery in the context of wired as well as partly wired/wireless networked services. Two important aspects of service discovery are the discovery architecture and the service matching mechanism. Protocols like Jini [1], Salutation and Salutation-lite [40], UPnP [25], UDDI [45], and Service Location Protocol [22] have been developed to let applications discover remote services residing on stable networked machines in the wired network. Some of these protocols (e.g. UPnP) can also be used by mobile devices to discover networked services using wireless networking technologies like 802.11 (a, b or g). The general architecture of these protocols is as follows: a service advertises and registers itself with a service registry that keeps track of networked services. Services can de-register at any point of time. Most of the communication happens over IP-type networks, and the discovery protocol relies on multicasts and broadcasts for important functions such as the discovery of the registry. In summary, these architectures are primarily centralized/semi-centralized and registration-oriented, and they implicitly assume that the underlying network is stable and capable of providing reliable communication. Clearly, service discovery in pervasive computing environments requires a decentralized design approach where a node should not depend on some other node(s) to advertise/register services. Each service should be autonomous and be able to advertise its presence.
Moreover, the discovery should also adapt itself to reflect the changes in the vicinity. A discovery protocol should be able to utilize the underlying network efficiently.

Existing service matching techniques in the above-mentioned protocols use simple matching schemas. They use interface descriptions (e.g. Jini), attributes [1], [40], or even unique identifiers (Bluetooth SDP [5]). Service matching is done at a syntactic level. However, syntactic-level matching and discovery is inefficient for pervasive environments due to the autonomy of service providers and the resulting heterogeneity of their implementations and interfaces. For example, the same service may implement different interfaces, so a syntactic match fails if the service query does not match any of them. To alleviate this problem, there has been considerable work to develop languages [29], [6], [20] to express service requirements and facilitate flexible semantic-level service discovery [11], [49], [18].

Service discovery architectures [23], [3], [2] developed specifically for pervasive environments are either request-broadcast based or advertisement-based. In a broadcast-based¹ solution, a service discovery request is broadcast throughout the network. If a node contains the service, it responds with a service reply. The protocol, under ideal conditions of a fully-connected network without message losses, offers high reliability in discovering a service. However, it suffers from the following disadvantages. First, global broadcast scales poorly with increasing network diameter and network size. Second, it utilizes resources and computation power on all nodes of the network, including nodes that do not even have the service or nodes that may not even fall on the route to the desired service. This extra processing is essentially redundant. Third, it utilizes significant network bandwidth (since the request traverses to all nodes through all possible paths) and hence creates a large load on the network.
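To make the cost of global broadcast concrete, the following Python sketch floods a single request over a small grid topology and counts the work done. It is a simplified illustration under stated assumptions, not the simulation used in this paper: the grid topology, the rule that every node rebroadcasts the first copy it hears, and the names `flood_request` and `grid` are all hypothetical.

```python
from collections import deque


def grid(n):
    """Build an n x n grid topology (assumed for illustration): node -> list of radio neighbours."""
    adjacency = {}
    for r in range(n):
        for c in range(n):
            neighbours = []
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < n and 0 <= cc < n:
                    neighbours.append((rr, cc))
            adjacency[(r, c)] = neighbours
    return adjacency


def flood_request(adjacency, origin, providers):
    """Flood one discovery request; each node rebroadcasts the first copy it receives."""
    seen = {origin}
    queue = deque([origin])
    transmissions = 0  # one radio broadcast per node that sends or forwards the request
    receptions = 0     # copies delivered to neighbours, duplicates included
    while queue:
        node = queue.popleft()
        transmissions += 1
        for neighbour in adjacency[node]:
            receptions += 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return {
        "transmissions": transmissions,
        "receptions": receptions,
        "nodes_processing": len(seen),            # every reached node handles the request
        "replies": sorted(seen & set(providers)),  # only these nodes hold the service
    }


if __name__ == "__main__":
    print(flood_request(grid(4), origin=(0, 0), providers={(3, 3)}))
```

In this toy setting a single provider on a 4 x 4 grid still causes all 16 nodes to transmit and process the request, which is the redundancy the three disadvantages above describe.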
The other solution is for the services to advertise themselves to all the nodes. Each node interested in discovering services caches the advertisements. The advertisements are matched with service requests and a result is returned. In this solution, the cache size increases with the number of services. Many of the nodes have limited memory and are unable to store all the advertisements, so the cache soon fills up. This is also inefficient in terms of bandwidth usage, since the whole network has to be periodically flooded with advertisements. There are solutions that offer both advertisements and broadcast of requests, but they nevertheless do not address the problems of network load, network-wide reachability and scalability.

¹ Broadcast-based protocol is also referred to as Request-broadcast based protocol in some parts of the paper.

Existing solutions have mostly considered service matching and the discovery architecture as two decoupled fields. This paper introduces a novel approach (dubbed Group-based Service Discovery or GSD) that combines the two by utilizing semantic service descriptions used in service matching to develop an efficient, distributed, scalable and adaptive service discovery architecture for pervasive computing environments. Our architecture is based on the concept of peer-to-peer caching of service descriptions, bounded advertising of services in the vicinity, and efficient selective forwarding of service discovery requests using functional group information being propagated with service advertisements. Functional grouping of services enables our architecture to encompass a broad range of discovery techniques ranging from simple broadcast to directed unicast, thus making it highly adaptable to the requirements of the network. Our solution exploits the semantic capabilities offered by the Web Ontology Language (OWL) [20] to effectively describe services/resources present on nodes in the ad hoc environment.
Furthermore, the services present on the nodes are classified into several groups based on the class-subclass hierarchy present in OWL. A service thus belongs to a hierarchy of groups starting from the parent group called "Service". This group information is used to selectively forward a service request to other devices where there are greater chances of the service being discovered. Semantic grouping of services is not uncommon in service matching research and has been used to enable functionally similar or "near" matches [11], [33]. We use the information to enable semantic matching and build a highly integrated yet distributed and efficient discovery infrastructure.

We have implemented GSD and extensively compared its worst-case and average-case performance with the traditional broadcast-based solution for service discovery. We provide results comparing GSD and broadcast-based service discovery with respect to average response time, average response hops, discovery efficiency, average network load and several other parameters. Our results show that GSD scales very well with respect to increasing network size and increasing request load on the system. Our experiments also show that the discovery efficiency of GSD is almost as good as that of broadcast-based solutions, and that GSD in fact performs better than broadcast-based solutions with respect to other parameters like response time and network load.

We will use the terms MANET (Mobile Ad hoc Network) and Pervasive Computing Environment interchangeably in the rest of the paper. MANET represents the extreme of the pervasive computing spectrum. Our system is designed to handle this extreme case and our simulations are done on a MANET. The remaining part of the paper is organized as follows: In section II we provide a brief description of the ontology and the functional grouping of services. Section III describes our protocol in detail. Section IV describes the various salient features of our protocol.
Section V presents our experimental results. We survey other related works in section VI and conclude in section VII.

II. GROUP-BASED SEMANTIC SERVICE DESCRIPTION

We have chosen OWL to define an ontology to describe services/resources in a MANET. There are a couple of reasons for choosing an ontology-based approach to describe services. (1) The semantics of OWL can be used to describe services in different nodes and also to enable semantic matching support with those service descriptions. Any resource or service is described in terms of classes and properties. In addition, OWL provides rules for describing further constraints and relationships among resources including cardinality, domain and range restrictions as well as union, disjunction, inverse and transitivity. These axioms can be easily exploited to create an ontology describing services and service groups. (2) OWL, which is based on eXtensible Markup Language (XML) and Resource Description Framework [29], is also being used as a standard to describe information/service on the wired infrastructure and the Web. This makes our service description interoperable with other semantic web infrastructures.

[Figure 1: tree of service groups rooted at Service; node labels: Hardware, Software, Input/Output, Computation, Require-Input, No-Input, Printer, InkJet, Laser]
Fig. 1. Hierarchical Grouping of Services

We have leveraged our prior work in development of the DReggie Ontology [11], which contains a comprehensive ontology for describing services in terms of their capabilities, inputs, outputs, platform constraints, and the capabilities of the device on which they reside. Using the class/subClassOf axiom of OWL, we have incorporated a preliminary grouping of different possible services in a MANET primarily based on service functionality. A significant advantage of our discovery architecture is that the ontology is extensible and one can modify it without altering the discovery mechanism.
The discovery mechanism would take into account the modification. Due to space restrictions, we are unable to provide the ontology. However, it is available at http://daml.umbc.edu/ontologies/dreggie-ont.owl. The generic class Service is functionally classified into two main sub-groups: Hardware and Software Service. Each sub-group is further classified in this manner till we reach a very specific service. For example, a color printer service may be classified under Service/Hardware/Input-output-type-Service/Printer-Service. Figure 1 shows the functional hierarchy.

III. SERVICE DISCOVERY PROTOCOL

Our protocol (GSD) is based on the concepts of (1) bounded advertising of services in the vicinity, (2) peer-to-peer dynamic caching of service advertisements, and (3) service group-based selective forwarding of discovery requests. Our protocol also has multiple user-controlled parameters that determine the extent of bounds for advertising, service caching and discovery request propagation. In this section we describe these key aspects of our protocol in detail.

A. Service Advertisements and Peer-to-Peer Caching

Each Service Provider (SP) periodically advertises a list of its services to all the nodes in its radio range. An advertisement message consists of the following fields:

<Packet-type, Source-Address, Service-Description, Service-Groups, Other-Groups, Hop-Count, Lifetime, ADV_DIAMETER>

A monotonically increasing identifier called broadcast-id, along with the source-address, uniquely identifies a broadcast and is used to detect duplicate advertisements. Please note that this identifier is different from source sequence numbers maintained by nodes in traditional ad-hoc routing literature. Sequence numbers refer to a single message identifier whereas broadcast-id refers to a broadcast event that may generate multiple messages. The Service-Description and Service-Groups fields contain information about the local service(s) and their corresponding service groups.
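The advertisement fields and the duplicate check described above can be sketched in Python as follows. This is a minimal illustration, not the implementation evaluated in this paper. The `PARENT` map is a hypothetical fragment of the group hierarchy (in particular, placing Laser and InkJet under Printer is an assumption read off Figure 1), and the names `group_path`, `Advertisement` and `DuplicateFilter` are introduced only for this example.

```python
from dataclasses import dataclass

# Hypothetical fragment of the service group hierarchy, stored as child -> parent.
PARENT = {
    "Hardware": "Service",
    "Software": "Service",
    "Input/Output": "Hardware",
    "Printer": "Input/Output",
    "Laser": "Printer",
    "InkJet": "Printer",
}


def group_path(group):
    """Return the chain of groups from the root 'Service' down to the given group."""
    path = [group]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return list(reversed(path))


@dataclass(frozen=True)
class Advertisement:
    """One advertisement; field names mirror the message fields listed in the text."""
    packet_type: str
    source_address: str
    broadcast_id: int          # monotonically increasing per source
    service_description: str
    service_groups: tuple      # groups of the sender's local services
    other_groups: tuple        # groups the sender has seen in its vicinity
    hop_count: int
    lifetime: float
    adv_diameter: int


class DuplicateFilter:
    """Remember (source-address, broadcast-id) pairs to drop repeated advertisements."""

    def __init__(self):
        self._seen = set()

    def is_new(self, adv):
        key = (adv.source_address, adv.broadcast_id)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True


if __name__ == "__main__":
    adv = Advertisement("ADV", "node-7", 1, "laser printer",
                        tuple(group_path("Laser")), (), 0, 30.0, 2)
    dedup = DuplicateFilter()
    print(adv.service_groups, dedup.is_new(adv), dedup.is_new(adv))
```

Keying the filter on the pair rather than on broadcast-id alone reflects the point made above: the identifier is only unique per source, so two providers may legitimately reuse the same value.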
Additionally, each node receiving the advertisement can forward it to all other nodes in its radio range. The field ADV_DIAMETER determines the number of hops each advertisement travels. Each node increments the Hop-Count when it forwards an advertisement; the Hop-Count is in turn used to compute whether the advertisement can be forwarded any further. Figure 2 shows the pseudo code for sending advertisements.

    Function SendAdvertisement(..) :-
    After each ADV_TIME_INTERVAL period do {
        Initialize Adv_Message;
        Adv_Message[Service-Description] = GetLocal_ServiceInfo(Service_Cache);
        Adv_Message[Service-Groups] = GetLocal_ServiceGroupInfo(Service_Cache);
        Adv_Message[Other-Groups] = GetVicinity_GroupInfo(Service_Cache);
        Adv_Message[Hop-Count] = 0;
        Adv_Message[Lifetime] = ADV_LIFE_TIME;
        Adv_Message[Adv_Diameter] = ADV_DIAMETER;
        Transmit Advertisement to all nodes in the radio range;
    }

Fig. 2. Pseudo Code of the Process of Advertising Services in the Vicinity

Each node, on receipt of an advertisement, stores it in its Service Cache. Each entry in the Service Cache contains the following fields:

<Source-Address, Local, Service-Description, Service-Groups, Other-Groups, Lifetime>

Apart from storing advertisements, a Service Cache also stores descriptions of local services in the node (identified by the Local field in each cache entry). The field Other-Groups contains a list of the groups that the corresponding Source-Address (sender of the advertisement) has seen in its vicinity. We follow a least-remaining-lifetime replacement policy to replace entries when the cache is full. We are aware of work in predictive cache modeling [13] and profile-driven caching [35], [15] that can be used in our architecture to model the cache replacement strategy. However, since cache replacement policies are not the focus of this paper, we chose a simple uniform cache replacement strategy for all the protocols. Figure 3 displays the pseudo code of the peer-to-peer caching and advertisement forwarding process.
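The caching and bounded forwarding behaviour described in this subsection can also be sketched in Python. This is a minimal sketch under stated assumptions rather than the protocol's actual code. It assumes one cache entry per Source-Address, that local entries are never evicted, that Lifetime is stored as an absolute expiry time, and that a node forwards only while the incremented Hop-Count stays below ADV_DIAMETER. The names `CacheEntry`, `ServiceCache` and `should_forward` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CacheEntry:
    """One Service Cache entry; fields mirror the entry format listed in the text."""
    source_address: str
    local: bool                # True for the node's own services
    service_description: str
    service_groups: tuple
    other_groups: tuple
    expires_at: float          # assumed: absolute time at which Lifetime runs out


class ServiceCache:
    """Bounded cache with least-remaining-lifetime replacement (assumed details)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = {}      # source_address -> CacheEntry

    def purge_expired(self, now):
        """Drop remote entries whose lifetime has run out."""
        expired = [key for key, entry in self.entries.items()
                   if not entry.local and entry.expires_at <= now]
        for key in expired:
            del self.entries[key]

    def store(self, entry, now):
        """Insert or refresh an entry; evict the remote entry closest to expiry if full."""
        self.purge_expired(now)
        is_new_source = entry.source_address not in self.entries
        if is_new_source and len(self.entries) >= self.capacity:
            remote = [e for e in self.entries.values() if not e.local]
            if not remote:
                return False   # only local entries left; nothing may be evicted
            victim = min(remote, key=lambda e: e.expires_at)
            del self.entries[victim.source_address]
        self.entries[entry.source_address] = entry
        return True


def should_forward(hop_count, adv_diameter):
    """Forward only if the advertisement would still be within ADV_DIAMETER hops."""
    return hop_count + 1 < adv_diameter


if __name__ == "__main__":
    cache = ServiceCache(capacity=2)
    cache.store(CacheEntry("A", False, "printer", ("Printer",), (), 10.0), now=0.0)
    cache.store(CacheEntry("B", False, "scanner", ("Input/Output",), (), 5.0), now=0.0)
    cache.store(CacheEntry("C", False, "display", ("Input/Output",), (), 8.0), now=0.0)
    print(sorted(cache.entries))
```

With ADV_DIAMETER set to 1 under this assumed rule, a receiving node caches the advertisement but never forwards it, so advertisements stay within the sender's radio range.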