
NASA Technical Reports Server (NTRS) 20020011679: An XML-Based Protocol for Distributed Event Services PDF



An XML-Based Protocol for Distributed Event Services

Warren Smith
NASA Ames Research Center
Moffett Field, CA 94035

Dan Gunter
Lawrence Berkeley National Laboratory
Berkeley, CA 94720

Darcy Quesnel
Argonne National Laboratory
Argonne, IL 60439

Abstract

A recent trend in distributed computing is the construction of high-performance distributed systems called computational grids. One difficulty we have encountered is that there is no standard format for the representation of performance information and no standard protocol for transmitting this information. This limits the types of performance analysis that can be undertaken in complex distributed systems. To address this problem, we present an XML-based protocol for transmitting performance events in distributed systems and evaluate the performance of this protocol.

Keywords: event service, XML, distributed computing, computational grids.

1 Introduction

There are many different projects from government, academia, and industry that provide services for delivering events in distributed environments. The problem with these event services is that they are not general enough to support all uses and they speak different protocols so that they cannot interoperate. We require such interoperability when we, for example, wish to analyze the performance of an application in a distributed environment. Such an analysis might require performance information from the application, computer systems, networks, and scientific instruments. In this work we propose and evaluate an XML-based protocol for the transmission of events in distributed systems.

One recent trend in government and academic research is the development and deployment of computational grids [14]. Computational grids are large-scale distributed systems that typically consist of high-performance compute, storage, and networking resources. Examples of such computational grids are the DOE Science Grid [3], the NASA Information Power Grid [8, 18], and the NSF Partnerships for Advanced Computing Infrastructure [9, 10]. The major effort to deploy these grids is in the area of developing the software services to allow users to execute applications on these large and diverse sets of resources. These services include security, execution of remote applications, managing remote data, access to information about resources and services, and so on. There are several toolkits for providing these services such as Globus [4, 13], Legion [7, 15], and Condor [1, 19].

As part of these efforts to develop computational grids, the Global Grid Forum [5] is working to specify general protocols and APIs to be used by various grid services. These specifications will allow interoperability between the client and server software of the toolkits that are providing the grid services. The goal of the Performance Working Group [6] of the Grid Forum is to codify best practices and promote interoperability for the storage and distribution of performance data. The resulting specifications must support tasks such as profiling parallel applications, monitoring the status of computers and networks, and monitoring the performance of services provided by a computational grid.

This paper provides an overview of a proposed protocol and data representation for the exchange of events in a distributed system. The protocol exchanges messages formatted in XML and it can be layered atop any low-level communication protocol such as TCP or UDP. Further, we discuss Java and C++ implementations of this protocol and their performance.

The next section will provide some further background information. Section 3 describes how we represent events and related information using XML. Section 4 describes our protocol and Section 5 discusses the performance of two implementations of the protocol.

2 Background

The Grid Forum Performance Working Group has defined the basic architecture shown in Figure 1. This architecture consists of three components: a producer, a consumer, and a directory service. A producer is something that is producing performance data, each unit of which is called an event. This producer can be an application profiler, a host monitor, or anything else. A consumer is something that consumes or receives events. A consumer might be a tool to calculate how much time is spent in each function of an application or a graphical interface showing the status of a set of hosts. A directory service is a database that is used to store and retrieve information about producers and consumers. It is accessed using a protocol such as the Lightweight Directory Access Protocol (LDAP) [17]. A host monitor may advertise itself in the directory service so that a consumer can search the directory service and find the monitor for a certain host. The consumer can then contact that producer in order to receive events about that host.

Figure 1. Grid Monitoring Architecture.

The Grid Forum Performance Working Group is defining the protocols and data representations required by this architecture. This includes:
• A definition of events and information related to events,
• The protocol for communicating between producers and consumers of events, and
• A definition of the structure and organization of the data in the directory service.

In this paper we describe a proposed producer-consumer communication protocol and the event information that is required by this protocol. Our protocol consists of an XML encoding of messages and the state machines that describe when these messages are sent. We do not specify the transport protocol on top of which our protocol will be layered. The transport protocol could be UDP, TCP, HTTP, SSL, or any number of other protocols.

We chose to use XML to represent our data for several reasons. First, XML provides a textual representation of data that is readable and therefore easier to debug. We could have selected a binary representation of our data for improved performance, but a textual approach seems more appropriate at our current experimental stage. Second, XML is self-describing and hierarchical, which makes it easy to represent structured event data. Third, XML was selected instead of any other textual representation because of the large and growing number of XML tools available and the growing number of people familiar with XML.

Another approach we could have taken was to use SOAP [12] or XML-RPC [11] and thus avoid explicit representation of the XML for each message. While this approach is a valid one, it has several drawbacks. First, neither SOAP nor XML-RPC has low-level transport bindings: SOAP has HTTP and SNMP bindings, and XML-RPC has only an HTTP binding. These bindings are not suitable for all of the situations we wish to address. Second, for SOAP, there is a lack of fully-featured implementations in the languages we are interested in and/or licensing restrictions on the available implementations. We will continue to track these XML-based protocols and may adopt them in the future.

Another approach would have been to use CORBA [20], for instance the CORBA Event Service. This approach, while also valid, would impose a significant administrative and development overhead if the target community did not already use CORBA, as is the case with the academic and scientific communities. SNMP [21] was also considered, but was not used due to its inability to handle streaming data efficiently.

3 Events and Event Parameters

Before we describe our protocol, we first describe how we use XML to represent events and event parameters. In this section and the following sections we provide example XML representations of the data we are representing.

As mentioned before, events are the basic unit of information in our architecture. An event is a named set of <name, value> pairs where the values are typed and there is always a pair that contains the time the event was generated. We represent this time using a time stamp that is a string formatted according to the proposed Grid Forum standard format [16]. This format is an extension of the ISO 8601 time format [2]. Each element will also have two optional attributes: units and accuracy. The units attribute indicates the units associated with the element's value (e.g. 'degrees' or 'bytes') and the accuracy attribute indicates what range of likely "real" values are represented by the element's value (e.g. '+/-5.0').

Associated with each event is a set of parameters that describe the information that can be passed to a producer of events as part of a subscription or query. The event parameters consist of a set of <name, value> pairs. Each element can have a units attribute associated with it. An example of an event and its parameters is shown in Section 3.1.

3.1 CPU Load

The CPU load event is a simple event containing the load information returned by the Unix uptime command. We therefore use the event name "UptimeCPULoadEvent" for this event to differentiate it from other means of measuring CPU load. This event must contain the following elements:
• TimeStamp. The time at which the CPU load event was generated.
• Load1. The 1 minute CPU load reported by uptime.
• Load5. The 5 minute CPU load reported by uptime.
• Load15. The 15 minute CPU load reported by uptime.
• HostName. The name of the host the load measurement is made on.

Here is an example of such an event in our XML encoding:

    <UptimeCPULoad xmlns="http://www.gridforum.org/Performance/Events">
      <Load1>1.5</Load1>
      <Load5>1.6</Load5>
      <Load15>1.3</Load15>
      <HostName>foo.gov</HostName>
      <TimeStamp>2000-11-09T21:51:45Z</TimeStamp>
    </UptimeCPULoad>

When asking for a CPU load event, the following input parameters can be specified:
• Period. The amount of time between each uptime event generation. This parameter is only used when a subscription is performed. If this parameter is specified for a query, it is ignored.

An example of how to specify this parameter is:

    <UptimeCPULoad xmlns="http://www.gridforum.org/Performance/EventParameters">
      <Period units="min">10</Period>
    </UptimeCPULoad>

4 Protocol

This section describes the XML protocol we use for communication between producers and consumers. Due to space limitations, we do not provide the XML schema or state machines for our protocol. Our protocol supports three major classes of interactions between producers and consumers.

In the first interaction, a consumer subscribes to specific events from the producer and the producer sends these events to the consumer. These events are sent out over a period of time until the producer or consumer ends the subscription. We call this interaction a consumer-initiated subscription.

The second type of interaction is the producer-initiated subscription. First, the producer contacts a consumer to request a subscription. Then events are sent from the producer to the consumer until the subscription is terminated. This type of interaction is useful, for example, when a producer sends events to an archive. In this case, the archive is the consumer.

The third type of interaction is a simple request/reply. In this case, a consumer requests information from a producer and the producer replies with the information. Our two previous interactions include request/reply interactions, but our protocol includes two instances of this interaction that stand on their own. First, there is a query interaction. In this interaction the consumer queries a producer for a single event and the producer replies with the event. Second, there is an available events interaction where a consumer requests a list of the events available from a producer and the producer replies with the list.

4.1 General Message Format

In general, each message consists of:
1. The number of bytes in the message. For our TCP binding, this is a 32-bit integer in network byte order.
2. The XML tags that indicate the message type.
3. Request messages always have a requester-unique request ID chosen by the requestor. This request ID is an attribute of the message tag.
4. Reply messages always have a request ID, which matches the request ID of the request that is being replied to.
5. Reply messages always have a return code and may have a detailed return message. The Return element indicates if an operation was successful (Success) or a failure (Failure). These return codes will most likely be expanded later to contain more detailed error codes. The ReturnDetail element contains a text message that contains detailed user-readable information about the status of a request.
6. The message-specific data inside the XML tags that identify the message.

We define three XML name spaces for use in our protocol. The name space http://www.gridforum.org/Performance/Events contains the events defined by the Grid Forum Performance Working Group, the name space http://www.gridforum.org/Performance/EventParameters contains the parameters defined by the working group that can be specified when asking for an event or events, and the name space http://www.gridforum.org/Performance/Protocol contains the elements which make up the messages of our protocol. Further, we allow any group to define events and event parameters in their own name spaces for use with our protocol.

4.2 Consumer-Initiated Subscription

When a consumer wants to receive a stream of events from a producer, it subscribes to the producer for the events. After a subscription successfully takes place, events are sent from the producer to the consumer until either the consumer or producer unsubscribes. There are five messages in this process, described in the following sections.

4.2.1. Subscribe Request

The subscribe request message initiates a subscription and consists of:
• A consumer-unique request ID (required).
• A consumer-unique subscription ID (required).
• Event parameters element (required).
• Any input parameters needed to generate events (optional).

Here is an example of a subscribe request message:

    <SubscribeRequest xmlns="http://www.gridforum.org/Performance/Protocol" requestID="1">
      <SubscriptionID>12</SubscriptionID>
      <UptimeCPULoad xmlns="http://www.gridforum.org/Performance/EventParameters">
        <Period units="sec">600</Period>
      </UptimeCPULoad>
    </SubscribeRequest>

In the future, we will add an optional event filter to subscription request messages. The filter specifies which events in the stream of events should be sent on to the consumer. For example, a filter may indicate that only CPU load events with a 5-minute load average greater than or equal to 5.0 should be sent to the consumer.

4.2.2. Subscribe Reply

The subscribe reply message is sent in response to a subscribe request and consists of:
• The requestID (required) of the request that this message is in reply to.
• Return (required). Success means the request was successfully completed, Failure means the request failed. Other return codes to represent more detailed failures will most likely be added in the future.
• ReturnDetail (optional). Text giving further information about the successful or unsuccessful subscribe.
• An optional producer-unique SubscriptionID that identifies the subscription that was successfully made by the consumer (if one was). The subscription ID should be present if the subscription was successful and should not be present if the subscription was not successful.

An example of a subscribe reply message is:

    <SubscribeReply xmlns="http://www.gridforum.org/Performance/Protocol" requestID="1">
      <Return>Success</Return>
      <SubscriptionID>99</SubscriptionID>
    </SubscribeReply>

4.2.3. Unsubscribe Request

Unsubscribe requests can originate at either the producer or consumer. In either case, the message has the same format. The unsubscribe request message consists of:
• A sender-unique requestID (required).
• The SubscriptionID (required) generated by the message target (i.e. producer if the sender is the consumer, consumer if the sender is the producer) that identifies the subscription that is being terminated.

An example of an unsubscribe request message is:

    <UnsubscribeRequest xmlns="http://www.gridforum.org/Performance/Protocol" requestID="9">
      <SubscriptionID>1234</SubscriptionID>
    </UnsubscribeRequest>

4.2.4. Unsubscribe Reply

The unsubscribe reply message is sent in response to an unsubscribe request and consists of:
• The requestID (required) of the request that this message is in reply to.
• Return (required).
• ReturnDetail (optional).

An example of an unsubscribe reply message is:

    <UnsubscribeReply xmlns="http://www.gridforum.org/Performance/Protocol" requestID="9">
      <Return>Success</Return>
    </UnsubscribeReply>

4.2.5. Event

An event message is sent from the producer to the consumer after a subscription is initiated. An event message consists of:
• The subscription ID (required) that was generated by the consumer.
• The event (optional) in the format described in Section 3. The event should be present if an error is not reported.
• Error (optional), indicating that an error occurred while generating the event.
• ErrorDetail (optional), which provides further information about the error that occurred while generating the event. This element should only occur in conjunction with the Error element.

An example event message is shown below.

    <Event xmlns="http://www.gridforum.org/Performance/Protocol" subscriptionID="1234">
      <UptimeCPULoad xmlns="http://www.gridforum.org/Performance/Events">
        <Load1>1.5</Load1>
        <Load5>1.6</Load5>
        <Load15>1.3</Load15>
        <TimeStamp>2000-11-09T21:51:45Z</TimeStamp>
      </UptimeCPULoad>
    </Event>

4.3 Producer-Initiated Subscription

There are cases where a producer of events may want to initiate a subscription. A common case is when a producer wants to archive the events it is generating. The request and reply messages used during a producer-initiated subscription are identical to those used for a consumer-initiated subscription. The only difference is that the producer requests the subscription instead of the consumer.

4.4 Querying for an Event

Often a consumer will want just one event from a producer. Instead of having a consumer subscribe, receive one event, and then unsubscribe, we allow a consumer to query a producer for an event. A query consists of a query request message that a consumer sends to the producer and a query reply message that the producer sends to the consumer in response to the query request message. The query reply includes the event that was requested.

4.4.1. Query Request

The query request message is very similar to the consumer subscribe request message and consists of:
• A request ID (required).
• Event parameters element (required).
• Any input parameters needed to generate events (optional).

Here is an example QueryRequest message:

    <QueryRequest xmlns="http://www.gridforum.org/Performance/Protocol" requestID="15">
      <UptimeCPULoad xmlns="http://www.gridforum.org/Performance/Events"/>
    </QueryRequest>

4.4.2. Query Reply

The query reply messages are similar to the event messages and consist of:
• A request ID (required) to identify which QueryRequest this reply is for.
• Return (required).
• ReturnDetail (optional).
• The event data in the format described in Section 3.

An example query reply message is:

    <QueryReply xmlns="http://www.gridforum.org/Performance/Protocol" requestID="15">
      <Return>Success</Return>
      <UptimeCPULoadEvent xmlns="http://www.gridforum.org/Performance/Events">
        <Load1>1.5</Load1>
        <Load5>1.6</Load5>
        <Load15>1.3</Load15>
        <TimeStamp>2000-11-09T21:51:45Z</TimeStamp>
      </UptimeCPULoadEvent>
    </QueryReply>

4.5 Requesting Available Events

Even though our architecture in Figure 1 shows a directory service that will be used to contain information on the events that are available from a producer, it is also convenient to be able to obtain this information from producers directly.

4.5.1. Event Names Request

The available events request message is very simple and only contains a request ID. Here is an example EventNamesRequest:

    <EventNamesRequest xmlns="http://www.gridforum.org/Performance/Protocol" requestID="15"/>

4.5.2. Event Names Reply

The event names reply messages consist of:
• A request ID (required).
• Return (required).
• ReturnDetail (optional).
• One or more Event elements that do not have values. Instead, they have two attributes:
  o The name attribute specifies the name of the available event.
  o The namespace attribute specifies the namespace.

An example event names reply message is shown below.

    <EventNamesReply xmlns="http://www.gridforum.org/Performance/Protocol" requestID="15">
      <Return>Success</Return>
      <Event name="UptimeCPULoad" namespace="http://www.gridforum.org/Performance/Events"/>
    </EventNamesReply>

5 Performance

In this section we present performance results for two independent implementations of our protocol. One implementation uses Java and the Xerces XML parser. The other implementation uses C++ and the expat XML parser. We examined the performance of these implementations using a 933 MHz Pentium III system running RedHat Linux 7.1 with JDK 1.3.

We found that the C++ implementation is significantly faster. It can decode 4,300 uptime CPU load event messages a second to C++ objects and encode 28,100 event messages a second from C++ objects. The Java implementation can decode 600 event messages a second and encode 21,900 event messages a second.

6 Conclusions

This document describes an XML-based protocol for the transmission of performance events in a distributed environment. The protocol we describe is a proposed standard in the Performance Working Group of the Grid Forum. The purpose of this protocol is to address the problem of providing performance information in a standard way so that different tools can provide and use such information. We require such interoperability in a computational grid when we wish to analyze the performance of an application that uses several different resources.

We constructed two independent implementations of this protocol that interoperate. One implementation is written using Java, and the other using C++. We found that the C++ implementation can decode messages significantly faster than the Java implementation but the encoding time is similar.

References

[1] "Condor High Throughput Computing," http://www.cs.wisc.edu/condor/.
[2] "Data elements and interchange formats - Information interchange - Representation of dates and times," International Organization for Standardization ISO 8601, 1998.
[3] "The DOE Science Grid," http://www-itg.lbl.gov/Grid.
[4] "The Globus Project," http://www.globus.org.
[5] "Grid Forum," http://www.gridforum.org.
[6] "Grid Forum Performance Working Group," http://www-didc.lbl.gov/GridPerf/.
[7] "The Legion Project," http://www.cs.virginia.edu/~legion/.
[8] "The NASA Information Power Grid," http://www.ipg.nasa.gov.
[9] "The National Computational Science Alliance," http://www.ncsa.uiuc.edu/access/index.alliance.html.
[10] "The National Partnership for Advanced Computing Infrastructure," http://www.npaci.edu/.
[11] "XML-RPC Home Page," http://www.xmlrpc.com.
[12] D. Box, D. Ehnebuske, G. Kakivaya, A. Layman, N. Mendelsohn, H. F. Nielsen, S. Thatte, and D. Winer, "Simple Object Access Protocol (SOAP) 1.1," The World Wide Web Consortium, 2000.
[13] I. Foster and C. Kesselman, "Globus: A Metacomputing Infrastructure Toolkit," International Journal of Supercomputing Applications, vol. 11, pp. 115-128, 1997.
[14] I. Foster and C. Kesselman, The Grid: Blueprint for a New Computing Infrastructure: Morgan Kaufmann, 1999.
[15] A. Grimshaw, W. Wulf, J. French, A. Weaver, and P. R. Jr., "Legion: The Next Logical Step Toward A Nationwide Virtual Computer," Department of Computer Science, University of Virginia, CS-94-21, June 1994.
[16] D. Gunter and B. Tierney, "A Standard Timestamp for Grid Computing." In Proceedings of the Global Grid Forum 1, 2001.
[17] T. Howes, M. Smith, and G. Good, Understanding and Deploying LDAP Directory Services: MacMillan Technical Publishing, 1999.
[18] W. Johnson, D. Gannon, and B. Nitzberg, "Grids as Production Computing Environments: The Engineering Aspects of NASA's Information Power Grid." In Proceedings of the 8th IEEE International Symposium on High Performance Distributed Computing, 1999.
[19] M. Litzkow and M. Livny, "Experience with the Condor Distributed Batch System." In Proceedings of the IEEE Workshop on Experimental Distributed Systems, 1990.
[20] A. Pope, The CORBA Reference Guide. Reading, MA: Addison-Wesley, 1998.
[21] W. Stallings, SNMP, SNMPv2, and CMIP: The Practical Guide to Network-Management Standards. Reading, Massachusetts: Addison-Wesley, 1993.
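
Illustrative sketch (not part of the original paper). The general message format of Section 4.1 and the subscribe messages of Section 4.2 can be made concrete with a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' Java or C++ implementation: the helper names (q, build_subscribe_request, frame, unframe, parse_reply) are hypothetical, and the sketch assumes the 32-bit length is unsigned and counts only the XML payload, which the paper does not spell out.

```python
import struct
import xml.etree.ElementTree as ET

# Name spaces quoted from Section 4.1 of the paper.
PROTOCOL_NS = "http://www.gridforum.org/Performance/Protocol"
PARAMS_NS = "http://www.gridforum.org/Performance/EventParameters"


def q(namespace, name):
    """Return the '{namespace}name' form ElementTree uses for qualified tags."""
    return "{%s}%s" % (namespace, name)


def build_subscribe_request(request_id, subscription_id, period_sec):
    """Build a SubscribeRequest shaped like the Section 4.2.1 example.

    Hypothetical helper: only the UptimeCPULoad Period parameter is supported.
    """
    root = ET.Element(q(PROTOCOL_NS, "SubscribeRequest"),
                      {"requestID": str(request_id)})
    ET.SubElement(root, q(PROTOCOL_NS, "SubscriptionID")).text = str(subscription_id)
    params = ET.SubElement(root, q(PARAMS_NS, "UptimeCPULoad"))
    ET.SubElement(params, q(PARAMS_NS, "Period"),
                  {"units": "sec"}).text = str(period_sec)
    # ElementTree emits generated prefixes (ns0, ns1) rather than default
    # name spaces; the result is namespace-equivalent to the paper's example.
    return ET.tostring(root)


def frame(payload):
    """Prefix XML bytes with their length as a 32-bit integer in network order.

    Assumption: the count is unsigned and excludes the 4 length bytes.
    """
    return struct.pack("!I", len(payload)) + payload


def unframe(buffer):
    """Split one complete message off the front of a receive buffer.

    Returns (payload, remaining_bytes), or (None, buffer) if more data is needed.
    """
    if len(buffer) < 4:
        return None, buffer
    (length,) = struct.unpack("!I", buffer[:4])
    end = 4 + length
    if len(buffer) < end:
        return None, buffer
    return buffer[4:end], buffer[end:]


def parse_reply(payload):
    """Return (requestID, Return text) from a reply message."""
    root = ET.fromstring(payload)
    return root.get("requestID"), root.findtext(q(PROTOCOL_NS, "Return"))
```

A consumer would send frame(build_subscribe_request(1, 12, 600)) over a TCP socket, feed received bytes through unframe until a full payload is available, and match the reply to its request by comparing the requestID returned by parse_reply. Keeping framing separate from XML handling mirrors the paper's point that the XML messages can be layered on different transports.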
