ebook img

Local Thresholding on Distributed Hash Tables PDF

0.21 MB·English
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Local Thresholding on Distributed Hash Tables

Local Thresholding on Distributed Hash Tables Abstract We present a binary routing tree protocol for distributed hash table overlays. Using this 3 1 protocol each peer can independently route messages to its parent and two descendants 0 2 on the fly without any maintenance, global context, and synchronization. The protocol n a J is then extended to support tree change notification with similar efficiency. The resulting 4 1 tree is almost perfectly dense and balanced, and has O(1) stretch if the distributed hash ] C table is symmetric Chord. We use the tree routing protocol to overcome the main imped- D iment for implementation of local thresholding algorithms in peer-to-peer systems – their . s c requirement for cycle free routing. Direct comparison of a gossip-based algorithm and a [ 1 corresponding local thresholding algorithm on a majority voting problem reveals that the v 6 7 latterobtainssuperioraccuracy usingafraction ofthecommunicationoverhead. 9 2 Keywords: DistributedHashTable, Local ThresholdingAlgorithms,Binary Tree . 1 0 Routing,GossipBased Algorithms,In-NetworkComputation,Chord, SymmetricChord 3 1 : v 1. Introduction i X r a In aworldofmillionsofwired devices,in-networkcomputationalgorithmsprovidean intriguingalternativetocentralization. Wheredistributeddataisabundantandbandwidthis limitedorcostly,someapplicationscanonlybeimplementeddistributively. Whereadverse manipulation and control are a concern, distributed architecture is often preferred over a centralized agent. Finally, scaling an algorithm to the millions of peers often teaches important lessons on asynchrony, speculative execution, and the containment of partial failure, whichproveimportantto moremundaneenvironmentssuchas grid systems. PreprintsubmittedtoElsevier January15,2013 Algorithms for distributed computation in peer-to-peer systems fall into several cate- gories. Ofthese,gossipalgorithmsarepossiblythemostpopularandcertainlythemostex- tensivelystudied[1,2,3,4,5,6,7,8,9]. Localthresholdingalgorithms[10,11,12,13,14] arecomparabletogossipbasedalgorithmsbecausebothaddresssimilarproblemsandsim- ilarly provide a proof for convergence. Local thresholding algorithms are considered by far more communication efficient than gossip based algorithms. However, they pose far stricterrequirementstotheunderlyingroutingprotocol. Agossipbasedalgorithmbasically requires an efficient way in which information can be propagated to random destinations. Incontrast,allknownlocalthresholdingalgorithmsrequirecyclefreerouting. Often,work onlocal thresholdingalgorithmsadvocates thataroutingtreebeinducedinpreprocessing. However, the non-trivial complexity of inducing and maintaining the tree in a dynamic networkhas so farrendered local algorithmsimpractical. Thiswork considerstheproblemofcomputationindistributedhash-table(DHT)over- lays – the de-facto standard architecture in peer-to-peer networks. Gossip algorithms can easily be implemented on a DHT: If each peer sends messages to a random peer from its fingertabletheninO(logN)messagesthisinformationwillarrivetoarandompeer. Local thresholdingcanbeimplementedinaDHTusingoneoftheexistingtreeroutingprotocols. However, existing tree routing protocols [15, 16, 17] are ill-fit for a local thresholding al- gorithm. Becausetheseprotocolsweredevelopedmainlytoreducemessageredundancyin broadcast or convergecast they operate in a top-down or bottom-up manner. Thus, a peer cannot send messages to its tree neighbors without the involvementof eitherthe root (in a top-downprotocol)oritsentiresubtree(inabottom-up). This paper makes two main contributions to the state-of-the-art: First, it presents new binary tree routing and change notification protocols for DHT overlays. This tree routing protocol is local and can be used for multi-way communication over the tree, including broadcast and convergecast. The effect of peer joining or leaving is also local and can be 2 detected and notified using no more than six messages that are routed on the tree. The Enabledbythebinarytreeroutingprotocol,thesecondcontributionofthispaperisadirect comparisonof local thresholdingand gossipbased algorithms. Ourexperimentsshow that regardless of system size or properties of the data, local thresholding vastly outperforms gossip. The results are so one sided that they call into question the continued relevance of gossipalgorithmstocomputationin DHToverlays. Therestofthispaperisorganizedasfollows: Thenextsectiondescribesthebinarytree routingprotocolandthechangenotificationprotocol. Section3detailstheimplementation of the two majority voting algorithms. Experiments are described in Section 4 and related work in Section 5. Finally, Section 6 draws conclusions and poses some further research problems. 2. LocalBinary Tree Routing The basic idea of the binary tree routing protocol is to define a mapping of peers to a subset of the nodes of a full binary tree. The binary tree can be defined in terms of a one-to-one mapping of d-long binary strings, namely addresses, to tree nodes. Then we definewhich peeris mappedtowhich address. Considerabinarytreewhoserootistheallzeroaddress. Any addressotherthanthatof theroot,wedivideintothreeparts: Anallzerosuffix,which mightbeempty,therightmost set bit, and a prefix, which might be empty as well. An address is therefore encoded as p10k,wherethelengthoftheprefix pisd−k−1. Wedefinethattheclockwisedescendant of the address p10k is the address p110k−1 and the counterclockwise descendant of the address p10k is p010k−1. Addresses ending with a set bit, i.e., p100, have no descendants. For completeness, we denote 10d−1 the clockwise descendant of the root. As can be seen inFigure2.1a,thismappingissimilar,butnotidenticalto,thetextbookimplementationof acompletebinary treeinan array. 3 Figure2.1: Binary Treerouting (a)Treemappingtotheaddressspace. (b)BinaryTreeRoutingonDHT A peer’s position in the tree depends on its assigned address space segment. The peer whose address space segment contains the all zero address takes the root position. Any other peer takes a position calculated as follows: Let the address space segment of p i be (a ,a ]. Let p be the (possibly empty), common prefix of a and a , such that i−1 i i−1 i a = p0X and a = p1Y. Then p takes the positionp10k. Note that messages routed to i−1 i i theaddresswhich isapeer’s positionwillalways beaccepted bythat peer. We conveniently denote the position with which p is associated as pos . We further i i denotetheclockwisedescendantofpos asCW [pos ]anditscounterclockwisedescendant i i asCCW [pos ]. Respectively,ifpos = CW [pos ]orpos = CCW [pos ],thenwedenote i j i j i pos = UP [pos ]. ThefunctionsCW,CCW,andUP canbecomputedforanypos using i j i bitmanipulations. The following two lemmas show that if there is more than one peer in the clockwise (or, respectively,thecounterclockwise)subtreeofapeer’s positionthenoneofthosepeers occupiesa positionwhichisa fore-parent ofthepositionsofall otherpeers. Lemma 1. The address space segment associated with the peers whose positions are the subtreebelowanypeer p is continuous. i Proof. SeeAppendixA. 4 Lemma2. Foranypeerp whosepositionispos ,oneofthepeerswhichoccupiespositions i i in thesubtreeofpositionCW [pos ](respectively, CCW [pos ])occupies apositionwhich i i isa fore-parentof thepositionsof alloftheotherpeers inthatsubtree. Proof. SeeAppendixA. Lemma2 can beput intodirect usetodefine abinary treeofpeers, ratherthan ofposi- tions: If all peers in the subtree of the address clockwise from p ’s position have positions i whichareinthesubtreeofp ,thenwedenotep theclockwiseneighborofp . Likewise, cw cw i the counterclockwise neighbor of p is the peer p , whose position is the fore-parent of i ccw all ofthepositionsinthecounterclockwisesubtreeof pos that areoccupied bypeers. i Itremainstodefinehowmessagescanefficientlyberoutedfromapeertoitsneighbors on the tree. The pseudocode of a protocol achieving this is detailed in Alg. 1. To deliver messages to the UP neighbor of a peer p , they are first addressed to UP [pos ] and then i i continuebeing routed to the UP [pos] of that address until they reach an address occupied by a peer. Clockwise and counterclockwise messages are first routed to CW [pos ] and i CCW [pos ]. Iftheyreachanaddressnotoccupiedbyapeer,thisisbecausethedestination i falls in the address space segment of a peer p occupying a different position. A new j destination,which is astep downthetree and away from thatofpos , is thuscomputed. If j thedestinationaddressexhauststheaddressspace, themessageisdropped. Forwardingamessageagainandagainuntilthedestinationisfoundortheaddressspace is exhausted is often wasteful and unnecessary. Whenever the address of the destination position falls in the address space of a peer who has a different position and is also a neighborofthesender,themessagecanbedropped. Thisisbecausethemessageisdoomed tobesentbackandforthbetweenthesenderandthereceiveruntileventuallybeingdropped as there is no peer between them to accept it. Fortunately, such communication patterns can easily be avoided if the sender denotes as part of the message header the edge of its addressspaceinthedirectioninwhichthemessageissent. Therecipientcanthencompare 5 Algorithm 1Local Binary TreeRouting Ondowncall to SENDwithmessageM anddirection d: Ifd isupward thendest ← UP [pos ]and edge ← null i Ifd iscounterclockwisethen dest ← CCW [pos ]and edge ← a i i−1 Ifd isclockwisethendest ← CW [pos ]and edge ← a i i Make a downcall to SEND with the destination address dest and the message hpos ,dest,edge,MiusingtheDHT i Onupcall DELIVER withthe messagehorigin,dest,edge,Mi: Ifdest = pos then call ACCEPT withthemessageM andfinish. i Ifdestis aforeparent of originthen newdest ← UP [dest]andnewedge ← null Elseifdestis intheclockwisesubtreeoforiginthen -Ifedge = a then finish i−1 -Iforigin = pos then newdest ← CW [dest]and newedge ← a i i -Elsenewdest ← CCW [dest]and newedge ← a i−1 Else -Ifedge = a then finish i --Iforigin = pos then newdest ← CCW [dest]and newedge ← a i i−1 --Elsenewdest ← CW [dest]and newedge ← a i - Make a downcall to SEND with the destination newdest and the message horig,newdest,newedge,Mi that to the edge of its own address space segment. If the edges are the same, the message can bedropped. Figure 2.1b illustrates this address scheme in a DHT composed of just nine peers and an address space of eight bits. For instance, peer number5, whose address space segment is (01110000,10011000], takes the tree position 10000000, and so forth. Messages routed counterclockwise from 9 first reach position 11010000, which is in the address space seg- ment of peer number 7. However, since the position of peer number 7 is 11000000, the messageisthenbouncedclockwisetoposition11011000,whichisoccupiedbypeernum- ber8. 2.1. Tree properties Thepropertiesofthetreewhicharethemostimportantareitsexpectedmaximaldepth, the expected degree of internal nodes and the expected stretch of a hop. The expected maximal depth is mostly important for broadcast and convergecast applications, in which 6 a message musttraverse thefull depth of thetree. The expected degree determines theex- pansionrate, whichiscentral forrepeated averagingalgorithmssuchas gossipalgorithms. In any algorithm, actual performance is proportional to the stretch – the number of actual messagesneeded to deliveramessagefroma treenodeto itsparent oritsdescendant. Lemma 3. Themaximaldepthof thetreeis O(logN) whereN isthenumberofpeers. Proof. SeeAppendixA. The stretch of the tree is defined in terms of the number of times the tree routing pro- tocol calculates a new destinationforan UP messageand lets theDHT route the message. Each such message may require up to logN IP messages. However, since the tree closely followsthefingertablelogic,symmetricChord peers willalmostalwayshaveadirect link to their CW, CCW and UP neighbors. Therefore, the number of IP messages required for everyDHTroutinginsymmetricChordisO(1). NoticethatmessagesintheCWandCCW directionfollowthesamepathonthetreeastheUP messageintheoppositedirection,and thereforehavethesamestretch. Lemma 4. Theexpected stretchof thetreeis a smallconstant. Proof. SeeAppendixA. 2.2. Neighbor changenotification The binary tree routing protocol in Alg. 1 defines neighbor relations logically and is therefore immuneto peerdynamics. Wheneverpeers join orleave the systemthe protocol simply reflects the change by delivering messages according to the current tree structure. However,somealgorithms,includingtheonein thenextsection,stillrequireexplicitnoti- fication whenoneofthetreeneighborschanges. The neighbor change notification protocol is based on the following property of the binary tree routing protocol: Let p be the successorof p and let their positions be pos i i−1 i and pos respectively. If p leaves the system then the position of p either remains i−1 i−1 i 7 Algorithm 2NeighborChangeNotification Definitions: Pos(a,b) isthepositionofa peerwhoseaddress spacesegmentis (a,b] On upcall NOTIFY that the predecessor address has changed from a to a or i−2 i−1 vice-versa: Compute pos = Pos(a ,a ) and pos = fix i−2 i var Pos(a ,a ) Pos(a ,a ) = pos i−1 i i−2 i−1 fix (Pos(ai−2,ai−1) Pos(ai−1,ai) = posfix Send the message hALERT,pos i in direction UP, CW and CCW from pos using fix fix binarytree routing. Send the message hALERT,pos i in direction UP, CW and CCW from pos using vol vol binarytree routing. Onupcall ACCEPTwiththe messagehALERT,posi: Ifpos isafore-parent of pos thendir ← upward i Elseifpos isin theclockwisesubtreeof pos then dir ← clockwise i Elsedir ← counterclockwise Notifytheapplicationofapossiblechangeoftheneighborin direction dir pos or changes to pos . In the former case, the parent of p becomes the parent of its i i−1 i−1 single direct descendant, if one exists. In the latter, p ’s former neighbors become the i−1 new neighbors of p and the former parent of p becomes the parent of p ’s former single i i i direct descendant. The sameproperty can provethat it is sufficientto alert thosesame five peers when p joinsthesystem. i−1 Lemma5. Theadditionorremovalofapeerp canonlyaffectthetreeconnectivityofonly i fivepeers whicharealltree neighborsofeither p oritssuccessor p . i i+1 Proof. SeeAppendixA. This property can be used to provide alerts on any single local change in topology. This is because when p leaves or joins the system, the DHT alerts its successor that its i−1 addressspacesegmenthaschanged. Oncep isinformedofthechangeintheaddressspace i segment it is able to calculate the positions whose neighbors might have changed. Hence, p can routealert messagesinall directionsfromthosepositions. i 8 3. MajorityVoting Given the infrastructure provided by the DHT overlay and by the binary tree routing and change notification protocols, we next compare representative local thresholding and gossip algorithms. We consider the simplest computation task: a majority vote. However, thetwoalgorithmswechoosearegoodrepresentationsoftheirrespectivefamilies. Weuse a variant of the local majority voting algorithm of Wolff and Schuster [10] and compare it to LiMoSense [9], which is a variant of the gossip averaging algorithm of Kempe et al. [1], suitable for dynamic data. Both algorithms were slightly adapted, and are therefore described inthefollowingsubsections. Theinputforboth algorithmsisa singlebit x ∈ {0,1}at each peer p and thecompu- i i tational task is to decide if on average most bits are one or zero. We realistically assume that the input of the peers can change at any moment, and thus that the algorithm never terminates. Theoutputofeach peerisan ad-hocassumptionon themajority. 3.1. Local majorityvoting In local majority voting, every peer bases its output on the statistics of votes it accepts from its tree neighbors – namely: UP, CW, and CCW. The peer stores for each of those neighbors two counter pairs: X and X for the upward direction, X and X UP,i i,UP CW,i i,CW for the clockwise, and X and X for the counterclockwise direction. Each of CCW,i i,CCW those counter pairs counts votes and the number of those votes which are of one. The counter pair X records the latest message received from direction v, and X the latest v,i i,v messagesent to direction v. Theyboth are initially (0,0). We convenientlydenote X = ⊥,i (x ,1) for the input of p . The knowledge of a peer is defined as the sum of all its inputs i i K = X . Whenever, according to its knowledge, the majority is of i d∈{UP,CW,CCW,⊥} d,i ones, P1,−1 tK ≥ 0, thepeeroutputsone. Otherwise,itoutputszero. 2 i To(cid:0)decid(cid:1)e when and which messages it must send, the peer computes for every di- 9 rection d ∈ {UP,CW,CCW} the agreement A = X + X . A violation occurs i,d d,i i,d when for a direction v ∈ {UP,CW,CCW} the sign of the agreement disagrees with the sign of the difference between the knowledgeand the agreement: 1,−1 tA ≥ 0 when 2 i,v 1,−1 t(K −A ) < 0 or 1,−1 tA < 0 when 1,−1 t(K (cid:0)−A (cid:1)) > 0. Such vio- 2 i i,v 2 i,v 2 i i,v (cid:0)lations(cid:1)can betriggeredby in(cid:0)itializa(cid:1)tion,by achangeo(cid:0)fthep(cid:1)eer’s vote,orby an incoming messagewhich changes oneoftheX . d,i To resolveaviolationtriggeredby theagreement withaneighborindirection v, apeer can send a message containing information on all of the votes received from neighbors in other directions. This is done by computing X ← K −X and sending X to the i,v i v,i i,v neighborinthedirectionv. Noticethatafterthismessageissent,A = K ,whichresolves i,d i theviolation. When a neighborin direction v changes, the pairs X and X no longer reflect mes- i,v v,i sages sent to or received from the current neighbor. Therefore, when a peer receives an alert of a change in direction v, it sets X to (0,0) and sends a message to that direction, v,i which sets A once more to K . The change detection protocol alerts the new neighbor i,v i as well. So the new neighbor will sending a message which reflects its own knowledge. Oncebothpeerssendandacceptthosemessages, A isagainequaltoA andreflects an i,v v,i agreement between p and itsnewneighbor. i Notethatifp doesnothaveaneighborindirection v, thenX remainszero anddoes i v,i not affect K or the result. Messages sent by p in direction v would be dropped by the i i binary treeroutingprotocol,butthiswouldnotbeindicatedto p . Wepreferwastingthose i messages to complicating the protocol with NACK messages. Additionally, note that to supportthepossibilityofoutof ordermessagedelivery,a sequentialnumberisattached to each outgoing message and a message is dropped when it arrives after a message which was sentsubsequently. 10

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.