TowardsReliableandScalableRobotCommunication Towards Reliable and Scalable Robot Communication AndreeaLutac,NataliaChechina,GerardoAragon-Camarasa,andPhilTrinder SchoolofComputingScience,UniversityofGlasgow,UnitedKingdom [email protected], {Natalia.Chechina, Gerardo.AragonCamarasa, Phil.Trinder}@glasgow.ac.uk Abstract MostofthedevelopmentworkbehindROScomesthroughcontri- butionsfromtheinter-disciplinaryroboticscommunity,withstrong TheRobotOperatingSystem(ROS)isthedefactostandardplatform academicrepresentation.ROShasgrownorganicallytosupporta formodernrobots.However,communicationbetweenROSnodes largerangeofcompatiblerobots.However,ROSexhibitsanumber hasscalabilityandreliabilityissuesinpractice.Inthispaper,we ofscalability,inter-processsynchronization,andreliabilitylimita- investigatewhetherErlang’slightweightconcurrencyandreliability tionsinpractice.Whiletheselimitationsarecommonknowledge mechanismshavethepotentialtoaddresstheseissues.Thebasis amongROSusers,theyhavenotbeenwidelyexploredorreported oftheinvestigationisapairofsimplebuttypicalroboticcontrol intheliterature. applications,namelytwoface-trackers:oneusingROSpublish/sub- Erlang [3] offers the right combination of capabilities for scribemessaging,andtheotherabespokeErlangcommunication robotics,namelyrealtimecapabilities,lightweightscalablecon- framework. currency,andworld-leadingreliability.Inconsequencetherehave We report experiments that compare five key aspects of the beennumerousacademicandindustrialprojectsinvestigatingthese ROSandErlangfacetrackers.WefindthatErlangcommunication synergies.TheseprojectsmainlyuseErlangtocontrolrobotsor scalesbetter,supportingatleast3.5timesmoreactiveprocesses robotcomponentse.g.[4,25],orteachErlangusingrobots[12] (700processes)thanitsROS-basedcounterpart(200nodes)while (Section2.2). consuminghalfofthememory.However,whilebothfacetracking Theapproachpresentedhereisnovelininvestigatingwhether prototypes exhibit similar detection accuracy and transmission Erlang has the potential to address the scalability and reliability latencieswith10orfewerworkers,Erlangexhibitsacontinuous issueswithcommunicationbetweenROSnodes.Thatis,wedon’t increaseinthetotaltimetakentoprocessaframeasmoreagents replacetheentireRobotOperatingSystem,onlythecommunication are added, and we identify the cause. A reliability study shows layer.Weinvestigatethehypothesisbycomparingthescalability that while both ROS and Erlang restart failed computations, the andreliabilityoftwoanalogousmiddlewaresystemsinatypical Erlangprocessesrestart1000–1500timesfasterthanROSnodes, roboticcontrolcontext.Thecontextisreal-timefacetracking,and reducingrobotcomponentdowntimeandmitigatingtheimpactof themaintasksaretoextractastreamofvideoframesfromacamera, thefailures. performfacerecognitionwithanumberofclassifierworkers,and CategoriesandSubjectDescriptors D.3.3[LanguageConstructs toredirectthecameratofollowthefacelocation.TheROSface andFeatures]:Frameworks trackercommunicatesusingpublish/subscribeonROStopics,while theErlangfacetrackerusesabespokeErlangframework. Keywords Robotics,ROS,Erlang,faulttolerance,scalability WestartwithanoverviewofROSandreviewpriorworkusing Erlanginrobotics(Section2).Wepresentthedesignandimple- 1. Introduction mentationoftheROSandErlang-basedfacetrackers(Section3). Weconductfiveexperimentsthatcomparedifferentaspectsofthe Modernrobotsaretypicallyhighlyconcurrent,complexsystems ROSandErlangfacetrackers.Namelywecomparetheirscalability thatrequireintensiveinteractionbetweenvarioushardwareandsoft- (maximumnumberofconcurrentworkers:ROSnodesorErlang warecomponents.Manyoftheservicescrucialtorobotoperation, processes);messagelatency;qualityofreal-timefacerecognition; suchassensormonitoring,motorcontrol,orgripperactuation,rely reliabilityinthefaceofROSnodeorErlangprocessfailuresinduced onreliableandconcurrentcommunicationbetweenhardware,soft- byaChaosMonkey[27];andtheimpactoffailuresonfacematch wareandintelligencecomponents.Thisinteractioniscommonly quality(Section4). mediatedbytheRobotOperatingSystem(ROS)[23],astateofthe artroboticmiddleware.ROSsupportsconcurrentsynchronousand asynchronousprocessescommunicatingbymessagepassing,and 2. Background allowsroboticiststocomposesystemsfromheterogeneoushardware andsoftwarecomponents. Since antiquity humans have been interested in automation, and therehavebeennumerousattemptstoconstructautonomousme- chanical systems. While theearly daysof robotics didnot yield muchinthewayoffullautonomy,sincethesecondhalfofthe20th century,roboticshasgraduallyexpandedtoprovideessentialtools inarangeofdomainslikemanufacturing,health-care,andspace exploration. Robotvisionisaparticularlyintriguingfacet,asitistheprincipal meansthroughwhichintelligentsystemsanalysetheirsurroundings. In the case of robotic systems, the capture and interpretation of visualstimuliisdoneviaartificialsensorssuchasdigitalcameras. [Copyrightnoticewillappearhereonce’preprint’optionisremoved.] Itisevidentthathumanbrainshaveevolvedtobeveryefficientat TowardsReliableandScalableRobotCommunication 1 2016/7/29 differentAPIsthatwouldnormallybeneededtoaccessrelevant systeminformation,suchassensordataandactuatorpositions.How- ever,asthenumberofnodesgrowsscalabilityissuesarise.Given the fact that ROS adopts a graph model to store and manage its networkofnodes,eitherasubstantialincreaseinthenumberofcon- Figure1. ExampleofaROSNodeGraph nectionsortheformingofacompletegraph(inessenceanall-to-all connection)islikelytoaffectitsperformance.Inaddition,ROS’ inter-processsynchronisationmechanismsarebasedontimestamps, visualdataprocessingtaskssuchasfeatureextractionandobject whichcaninducefailureincertainprocessesifthetimestampsare detection.However,thesetasksremainchallengingforcomputer notreceivedinorder[6]. systems,andsolutionsoftenstressphysicallimitationsinmemory, Furthermore,ROSprovideslimitedmechanismstodetectfail- clockspeed,datatransferrates,andtheenergyconsumptionofthe uresandrestartnodes.Forexample,ROSfault-toleranceconsists hardwaretheyutilise[16]. ofrestartingaprocessonlywhenitterminatesabruptly,without consideringnetworkdisconnectionsorruntimeerrors.Thefailed 2.1 RobotOperatingSystem ROSnodepoisonsthesystem,e.g.allotherROSnodesthatexpect Modernrobotsarecomplexheterogeneoussystems,andinrecent datafromthefailednodecanpotentiallystopworking.Thetypical years more and more robotic applications have started to make solutionismanualfailuredetectionandreboot.AllthismakesROS extensive use of the common platform provided by the Robot notsuitableforautonomouslong-termoperations.Industrialand Operating System (ROS) [23]. Originally developed at Stanford academicroboticistsareawareof,andconcernedabout,thelackof ArtificialIntelligenceLaboratory,theprojectoffersclientlibraries reliableandscalablecommunicationsinROS,butthesedeficiencies mainlyinC++(roscpp[24]),Python(rospy[5])andLISP,butalso haveremainedlargelyundocumentedintheliterature. experimentallyinotherpopularlanguagessuchasJava,JavaScript, Haskell,andRuby[21]. Sinceitsconceptionin2007,ithasbecomethemostwidelyused 2.2 ErlanginRobotics platformforrobotcontrolinbothacademicandindustrialsettings, Erlang[3]clearlyprovidestherightcombinationofcapabilitiesfor asmoresystemsbegantotakeadvantageofitsmodularity,inter- robotcontrol,namelyrealtimecapabilities,lightweightscalable platformcommunicationcapabilitiesandin-builtsupportforawide concurrency,andworld-leadingreliabilitymechanisms.Hence,there rangeofhardwarecomponents. areanumberofacademicandindustrialprojectsseekingtoexploit The main aim of ROS is to provide operating system-like itscapabilities. functionalities,forinstance: One of the first projects was conducted at the University of • hardwareabstraction,whichenablesthedevelopmentofportable Catania in 2005 [25]. The goal of this work was to produce a codewhichcaninterfacewithalargenumberofrobotplatforms; completeroboticframeworkinErlangforasetofautonomousrobots participatingintheEurobotcompetition[18].Theprojectgreatly • low-leveldevicecontrol,facilitatingroboticcontrol; emphasisedsystemmodularity,whichmeantthattheframework • inter-processcommunicationviamessagepassing; wasconstructedusingalayeredarchitecturetoseparateeachprocess concern, starting with the low-level control layers that govern • softwarepackagemanagement,whichensurestheframeworkis physicalmovementsandinterfacewithrobothardware,tothehigher- easilyextensible. levelinterpretationlayers,suchasplanningandreasoningfunctions, Ofparticularinterestamongsttheseistheprocessmanagement andcontrollogic.Erlang’sconcurrentfeatureswerealsoutilised aspect.Indeed,theheavyrelianceofROSonitsmessage-passing bothintheAIaspectsoftheprojectandinthetaskofefficiently communicationinfrastructure[20]iswheremostofitsperceived parallelizingtherobot’scontrolloops.While,traditionally,these benefitsanddrawbackslie.Processesaretermed“nodes”inROS, moduleswouldhavebeenimplementedinlanguageslikeLISP[19] as they form part of an underlying graph (Figure 1) that keeps andC/C++, theteamasserts thatErlangcansuccessfully merge trackofnodesandallthecommunicationsbetweenthematafine- bothlanguage’scapabilities.TheteamlatercreatedROSEN[26], grainedscale.Sincemostroboticsystemsareassumedtocomprise a robotic framework and simulation engine developed in Erlang numerous components, all requiring control or I/O services, the similarly designed to separate generic functionalities from the ROSnodearchitectureexpectseachcomputationtobedelegated idiosyncrasiesofspecificroboticsystems. to a separate node, thus reducing the complexity of the control The RT-Erlang project was conducted in 2009–2010 at the systems. At the core of all the programs running on ROS is an NationalInstituteofAdvancedIndustrialScienceandTechnology anonymizedformofthepublisher-subscriberpattern[7],i.e.ROS (AIST)inJapan[4].Theinitialgoalofthisprojectwastodevelop nodescommunicateanonymouslyduetothefactthatconnections anErlangtoolforthemonitoringandorchestrationofanetwork arenotformedbetweennamedpairsofnodes,butratherchannels ofOpenRTM-aist[2]components.MuchliketheROSframework, themselvesarenamed,andthenanytopiccanpublishorlistento OpenRTM-aistisanopensourceroboticmiddlewarewhichprovides thechannels. communicationservicestosensorandcontrolprogramsdesignated Atthestartofitsexecution,anodeisexpectedtofirstregister asRT-components.Followingthecompletionofthistool,amuch withthemasterROSservice,roscore.Themasternodeisthecore more in-depth investigation of Erlang’s use in low-level robotic ofaROSsystem,responsibleforprovidingnamingandregistration control was launched, culminating with the development of a tonodesconnectedtoit.Itsprimaryfunctionalityistoactasanode completere-implementationofOpenRTM-aistinErlang. discoveryhub,whichmeansthatafternodeaddresseshavebeen Both Erlang Framework and RT-Erlang projects emphasised relayed as needed, inter-node communication can be done peer- the importance of ensuring proper functioning of separate, yet to-peer.Newly-creatednodesarethereforeabletocommunicate dependentsoftwarecomponents.Incaseoftheformerproject,these through unidirectional topics, request-reply RPC services, and a wererepresentedbythevariousrobotcontrolloopsthatpasseddata globally-viewableparameterserverforstaticvariables. fromonetotheother(e.g.wheelcontrolandspeedsensing),whilein TheROSmessage-passingarchitecturesupportstheeasyinte- theRT-Erlangproject,thesecameintheformofintercommunicating grationofcustomusercodewithinitsecosystemandunifiesthe RT-components. TowardsReliableandScalableRobotCommunication 2 2016/7/29 (a) ROS (b) Erlang Figure2. ArchitecturesoftheROSandErlangFaceTrackingPrograms TherehavebeenindustrialexperimentsusingtheErlangfamily inrealtimeandthenrelaystheresultstocamerapan-tiltactuators oflanguagesforrobotcontrol,e.g.aUSbasedsoftwaredeveloping toachievefacetracking.Thedecisionofusingapan-tiltcameraas company Isotope11 developed drone demonstrators [1] in 2014 anarchetypalroboticcomponentisbasedontwomajorfactors.The usingElixir[14].Thiswasalowcostsideprojectthatdemonstrated firstoftheserelatestotherolethatvisionplaysinrobotics,which howtouseErlangandElixironAndroidmobileplatforms. makesimagingsensorsarelativelyubiquitouspieceofhardwarefor Wheretheforegoingprojectsmainlyseektoimplementrobot autonomousrobots.Thesecondfactorisduetothestraightforward controlpredominantlyinErlang,ourapproachisnovelinpreserving way in which image manipulation programs can be distributed ROSasacommonplatform,andreplacingonlythecommunication andparallelized,thusfacilitatingthecompletionofscalabilityand layerbetweenROSnodes. reliabilityexperiments. Erlanghasalsobeentaughtthroughrobotics,providingarobotic SoftwaremodulescommontobothROSandErlangfacetracking platform.TheprojectwasacollaborationbetweenRWTHAachen programscomprisethefollowing[17]. andtheUniversityofKent[12].Theproject’sgoalwasprimarily • Off-the-shelfOpenCVfacedetectionandlocalisationdetectors. educationalandfocusedonbuildinganinteractiveframeworkto Here, we use in-built OpenCV Haar feature-based cascade teachErlangviaroboticsimulation.Thesoftwareartefactproduced classifiers[22]. asaresultoftheprojectwasKERL[11],theKentErlangRobotics Library.HereErlanguserlevelandmiddlewaremodulescommu- • SoftwaredrivertocontroltheLogitechQuickCambasedonthe nicate with a robotic driver implemented in C++, which in turn outputfromthefacedetectormodules. makesuseofPlayer[9],across-languagerobotdeviceinterface • Userinterfacetoreportthestateofthesystemandpresenta foravarietyofcontrolandsensorhardware,andStage[29],a2D viewoftheimagescurrentlybeingcaptured,annotatedwithany simulationenvironmentsimulator. detectedfacesandtrackingdata. The combination of these technologies created a framework which,inspiteofbeingaresearchprojectwithlessconventional Figure2providesanoverviewofcomponentsandinteractions applications, was nevertheless successful in providing a starting ofthefacetrackingprogramimplementedinROSandErlang.To pointforroboticsinErlang.Incontrastweneitheraimtobuilda facilitateacommonsoftwarearchitectureinbothversions,weuse roboticplatform,norusethetechnologyasanErlangteachingtool. Python(highlightedinyellow)thatperformsthefollowing:image acquisition,facedetection,framedisplay(toinformallyvalidate tracking results), and camera operation. By having a common 3. DesignandMethodology and homogeneous software base, it allows us to focus on the 3.1 CommonDesign communication capabilities of ROS and Erlang while keeping remainingfunctionalityunchanged. TheROSandErlangfacetrackingsystemsfollowasimilarclient- Thekeyaspectexaminedinthisdesignwasdistributionoftheim- servermodel,andforsimplicitywedescribeasinglearchitecture. ageframestotheworkernodes/processesinbothimplementations. The system comprises an image capturing device and pan-tilt Weconsideredtwoalternatives: actuators.Theimagecapturingdevice(apan-tiltenabledLogitech QuickCam(cid:13)R camera1)performsfacedetectiononframescaptured 1. random distribution of frames between workers (effectively ensuringframesareneverprocessedmorethanonce); 1http://support.logitech.com/en_us/product/ 2. broadcastingeveryframetoallworkernodesandsynchronise quickcam-sphere-af theirdetectionresultsattheaggregationstage. TowardsReliableandScalableRobotCommunication 3 2016/7/29 Whilethebroadcastmethodincursnegligibledetectionaccuracy pingframesretrievedfromthecamera.TheInt32Numpy,onthe penalties,andhenceisabetterfitforasystemwhichperformsa otherhand,isacustommessagetypecreatedsolelyforthepurposes mean-basedfacedetection,italsopresentssomedownsides.Thatis, ofthisapplication,whichwrapsasetoffourintegersoftypeint32, apartfromincreasingthevolumeofmessagesbyafactorofn(where anumericaltypespecifictotheNumpyPythonlibrary[10]. nisthenumberofworkers),suchreplicationalsorequiresadditional Havingdefinedtheabovemessagepassingprotocols,themain code that can determine when all of the results for a particular nodeproceedstoopenacontinuouscamerafeedusingOpenCVand frame have been received at the aggregator before actuating the randomlyselectsaframetopictopublisheachretrievedframe. camera.Duetothecomplexityandaddedlatencyofthebroadcast ThesecondmajorcomponentoftheROSapplicationresidesin method,weemployrandomframecastapproach.Broadcast-based thecodeusedbyallworkernodeinstances.Thismodulefeaturesan distributionishoweverusedbrieflyinthefacequalityexperiments initialization process similar to the main node, registering itself (Section4.3),whereitenablestoexposeaflawinROS’message under a unique name, face_tracking_i (where i ∈ [0;n)), synchronisationprotocol. definingthetopictowhichitintendstopublishtheresultsofface detection,andfinallysubscribingtoitsdedicatedframetopic.The 3.2 ROSinterface nameofthecallbackfunctionforthelatteroperationisinfactthe The ROS face tracking program has two types of ROS nodes nameoftheworkermodule’sonlyfunction,aggregate_faces, (Figure2(a)):mainandworker.Themainnodeactsasaninterfaceto thatcarriesoutfacedetection. thecameraandisresponsibleforacquiringimagesfromthecamera Onceaframeispublishedtotheframetopic,thefacedetection and distributing them to the worker nodes. The communication functionbeginstoprocessit,callinguponthedetectMultiscale() betweenthemainnodeandasetofworkernodesisasynchronous. functionfromtheOpenCVlibrary.Thefunctionappliesaclassifier Theworkernodeswaitforframeinputsandsubsequentlydetect toasuppliedimageandreturnsalistofrectanglesdenotingdeduced facesonthegivenimageframe.Detectedfacesfromeachworker facepositions.Eachoftheserectanglesiscomposedofthefollowing nodearesentbacktothemainnode,wherethefacecoordinates fourNumpyintegerelements,whichaimtocharacterisethelocation (representedasarectangularbox)areaveragedtoobtainasingle ofafacewithrespecttotheimageplane: facedetection.Thecentreoftheaveragedrectangularboxisthen • x_coordisthetopleftx-coordinatepointofarectanglethatthe usedtocomputethephysicaladjustmentrequiredforthecamera detectorassumescircumscribesaface; gazetocentretothenewfaceposition. ThecoreoftheROSfacetrackingapplicationisconfinedtoa • y_coordisthetoplefty-coordinatepointofthesamerectangle; singlePythonscript.Thisnodeinitialisesitsoperationsbyregis- • widthisthewidthoftherectangle; teringitselfwiththeROSmasternodeunderauniquenodename, • heightistheheightoftherectangle. face_tracking_main.Thisisfollowedbythenodeestablishing theparametersofitstopicinteractions(Listing1).Thatis,themain Toensurethatonlyonefaceisreturnedasadetectionresult,thelist nodecreatesntopics,camera_frames_i,wherei∈[0;n)–one ofrectanglesiscondensedtoasinglevalue,i.e.thelargestrectangle, foreachworkernodeitexpectstoengage–andreservesthemforthe whichisduetothenatureofthecascadeclassificationprocessthat publishingofImage-typemessages.Similarly,thenoderegistersas considersittobethelikeliestfacecandidate.Theresultingfacedata asubscriberfornfacetopicscarryingInt32Numpy-typemessages. isthensentbacktothemainnode. Themost importantparameterof thesubscriptionprocess isthe Thefunctiondesignatedtoreceivefacesatthemainnodeisthe callbackfunctionname,whichspecifieswhichofthemainnode’s aggregator,whoseroleistomanagethelargeinfluxoffacereadings functionsaretobeinvokedwhenamessageispublishedtoanyof and determine what movements the camera should execute. For itssubscribedtopics.Inthisparticularcase,thecallbackfunction this,weemployafinite-sizedequeue(double-endedqueue)data forallfacetopicsissettoafunctionresponsibleforaggregatingthe structureofvariablecapacity[13].Declaredasaglobalparameter, facesandissuingcamerainstructions. the dequeue is gradually filled with each incoming face until it reachesitsmaximumnumberofelements,atwhichpointtheoldest Listing1. InitializationofROSMainNode elementisdiscardedandthedequeue’s“tail”isshuffledbackwards 1 rospy.init_node(’face_tracking_main’) tomakespaceforanewface.Thisslidingwindowapproachtoface 2 cap = cv.VideoCapture(device_ID) storageisthenpairedwithanaggregationoperation,wherebythe 3 # Create frame topics facecoordinatescurrentlyinthedequeueareaveragedtoproduce 4 publishers = [] a single “best” face reading. Consequently, these features help 5 for i in xrange(expected_workers): circumventtheissueofconflictingconcurrentcameraaccessand 6 topic = ’camera_frames_’+str(i) ensurethattheresultingfaceisareasonablereflectionofthereal 7 publishers += [rospy.Publisher(topic, Image, facepositionatthetimeofaggregation. queue_size=1000)] Lastly,aROSlaunchfileisusedtostartbothmainandworker 8 nodesoftheapplication,aswellastheroscorenodewithwhich 9 # Subscribe to face topics 10 subscribers = [] allactivenodesareregistered.ThisXML-formattedfileservesasa 11 for i in xrange(len(publishers)): configurationspecificationfortheprogram,informingROSofwhere 12 topic = ’camera_frames_’+str(i) thenodesarelocated(i.e.whichpackageandwhichhost/machine), 13 subscribers += [rospy.Subscriber(topic, whatuser-definedparameterstheyrequire,andwhethertheiralive Int32Numpy, aggregate_faces, callback_args=[cap statusshouldbemonitored.Thislastaspectisparticularlysignificant ])] whenitcomestoprotectingthesystemfromunexpectedfailures Notably,thefacetopicsneedtoexistforthesubscriptionprocess (Section4.4). tosucceed,asthecallbacksystememployedbyROSensuresthat 3.3 Erlanginterface subscriber-topic interaction only occurs when a message is pub- lished/received.Regardingthetwoimagemessagetypesdescribed The Erlang face tracking program is conceptually similar in its above:ImageandInt32Numpy.TheImageisanin-builtROSmes- designtotheROS-basedarchitecture,yetrequiresmoremodules sagetypethatenablesanefficientencodingofpixelmatricesfor writteninbothPythonandErlang(Figure2(b)).Thisaddedcom- publishingtoaROStopicandis,therefore,highlysuitedforwrap- plexityisduetoususingErlangonlyfortheimplementationof TowardsReliableandScalableRobotCommunication 4 2016/7/29 Table1. ROSandErlangCodeComparison ROSinterface Erlanginterface Processing Communication Reliability Processing Communication Reliability Modules 2 1* 1(ROSlaunchscript) 2** 2 3supervisors Linesofcode 194 1 22+ 104 139 81 Total 200+linesofcode 300+linesofcode *CommunicationishandledimplicitlybyROS,butthecustomimagemessage(Int32Numpy,Listing1)wasaddedtosupportfloat32images **OnePythonmoduleforprocessingandoneErlanginterfacemoduletostarttheapplication the message-passing middleware components. In ROS, however, tertocommunicatewithPythonfunctionsoutsideoftheVM,with message-passingisprovidedthoughlibrariesasdiscussedinprevi- theexceptionthatinthiscasetheserverpatternmatches{face, oussections. Face}messagesreceivedfromallofthePythondetectorsandsends MirroringtheimplementationoftheROSfacetrackingprogram theaggregatedfacepositiontoaPythontrackingcontrolfunction. (Section3.2),theErlangfacetrackingprogramisalsostructuredas Aprominentfeatureoftheaggregatorserver’sstateisthatitisalist twoprimarymodules:onecorrespondstoeachinstanceofPython datatype,inwhichthefirstelementisitsassociatedPythoninstance workerprocessesandtheotherencompassestheaggregationoper- andthesecondisaqueuestructure(sharedamongsttheserver’s ation.However,thesemodulesareexclusivelytaskedtoperform functions)anddesignedtostorethefacesbeforeaggregation.While message-passing.DivergingfromROS’free-formcodedesign,the Erlangdoesnotofferabuilt-indequeuedatastructureinthesame twoErlangmodulesfollowstandardErlanggen_serverbehaviour wayPythondoes,itsfunctioningissimulatedthroughadditional thatmodelstheserversideoftheprogram’sclient-servercommuni- checksthatprobethequeueonfullnesseverytimeanewelementis cations.Accordingly,clientsarerepresentedbyPythonfunctions added,andremovetheoldestelementifthequeueisfull.Thatis, thatdealwithimageacquisitionandprocessing,aswellascamera onceafacemessageiscasttotheaggregator,theserverpushesit interaction.NotethatthefunctionalequivalenceoftheROSand ontothequeueandcomputesthesamemean-basedgroupingopera- Erlangapplicationsconsiderablyfacilitatescodereusebetweenthe tionasintheROSframework.Thelaststepoftheprocessinvolves twofacetrackingsystems,meaningthatthePythonmodulesde- therelayandtranslationofthesefinalfacecoordinatesintousable velopedforuseinROSspacerequiresminorre-structuringtobe cameracontrolinstructions,viaacalltoaPythonfunction. portedasErlangclients.Sincetheseimplementationsarealready ThereliabilityintheErlangimplementationissupportedbyasu- coveredintheprevioussection,wedonotgointofurtherdetailon pervisiontree(Figure3).Thetreeconsistsoftwomainsupervisors thePythonfunctions,insteadweproceedtotheErlangmodules. thatmonitortheexecutionofthefaceserverprocessesandtheaggre- The face_server module listens for incoming image frame gatorserverprocess,respectively,andonetop-levelsupervisor.The messagesfromthePythoncameradriverandrelaysthemtoaPython top-levelsupervisoroverseesthefunctioningofitsdirectchildren. facedetector.Whilethegenericserverspecificationprovidescodefa- Allofthetree’sverticesareconfiguredaspermanentchildren,im- cilitiesforbothsynchronousandasynchronousmessaging–through plyingtheymustalwaysberestarteduponfailure.Thesupervisor thehandle_callandhandle_castfunctions,respectively–only modulesoffermorecustomisabilitythantheROS’.launchscript, thelatterisusedtoresembleparallelROS’topic-basedinteraction allowingfordifferentrestartstrategiestobeemployed.Thisimple- (Listing2).ForthePythoninteractiontotakeplace,attheinitializa- mentationmakesuseofthestrategycalledone_for_one,whereby, tionthefaceservercreatesaninstanceofaPythoninterpretervia inthecaseoffailuresoccurringinamulti-childsupervisor,onlythe theErlPortlibrary[28].Areferencetothisinstanceisthenpassed failedchildprocessesarerestarted. toeachoftheserver’sfunctionsasa“state”parameter. 3.4 ROSvs.ErlangImplementationComparison Listing2. gen_serverFunctionsfortheface_server Throughanempiricaltestingandsubsequentextensiverefinements 1 % Asynchronous handler for incoming frames; wehavedeterminedthatthetwofacetrackingprograms–ROSand 2 % frames are pattern matched to ensure correctness Erlang–arefunctionallyequivalent.InTable1wesummarisethe 3 handle_cast({frame, Frame}, PyInstance) -> implementationdetails.ItshowsthattheErlangimplementation 4 % Call the Python detector, supplying the frame requiresmorecodefacilitiesthanitsROSanalogue.Whilethismay and the Erlang Process ID 5 python:call(PyInstance,facetracking,detect_face,[ Frame,self()]), 6 {noreply, PyInstance}; Top-level supervisor 7 8 % Synchronous message handler; 9 % set to issue no replies 10 handle_call(_Message, _From, PyInstance) -> {noreply , PyInstance}. Face detectors’ Aggregator’s supervisor supervisor Whenever a message is sent (or cast) to the face server, the servermatchesthemessageagainstthe{frame, Frame}pattern. Thehandle_castperformsacalltothePythondetectorfunction viathePythoninterpreter,wherethecallprocesscanpasstoPython functionsparametersofanytype.TheErlangcalleronlyresumes operationoncethePythonfunctionreturns.Eachfaceserverhas … Erlang aggregator process its own associated Python interpreter, thus the server actors can easilyrunindependentlyfromoneanother.Asimilarmulti-threaded behaviourisobservedinROSnodes. Erlang face detecting worker processes Likewise,theaggregatormodule,aggregator_server,makes useofitsdefinedgen_serverbehaviourandaserverstateparame- Figure3. SupervisionintheErlangFaceTrackingApplication TowardsReliableandScalableRobotCommunication 5 2016/7/29 Table2. ExperimentParameters VariableName VariableDescription Value VariablescommontobothROSandErlang The name under which the operating system identifies theLogitechQuickCam(cid:13)R cameraport.Theportlabelling DeviceID 0or1 is the choice of the OS and thus this variable must be manuallypassedtotheprograms. ThenameoftheHaarfeatures-basedcascadeclassifier CascadeClassifierID haarcascade_frontalface_1 usedduringtheOpenCVfacedetectionstage. The list of arguments passed to the detection function: scaleFactor (how muchthe imagesize isreduced at eachstep),minNeighbors(theconfidenceoffacedetec- scaleFactor=1.1,minNeighbors= OpenCVDetectorArguments tionwherehighervaluesmeanamorerigorousselection 6,minSize=(50,50) process),minSize(facessmallerthanthisminimumrect- anglesizeareignored). Thesizeoftheslidingwindowtheaggregatorshouldtake intoaccount.Thisparameterwasexperimentallyfound AggregatorDequeueLength toyieldthebestresultsatlowvalues(Section4.3),since 5 havingmorefacestoaggregatecausestheaverageface valueto“lag”behindthereal-timeposition. ROS-specificVariables 5Hz,asimposedbyrospymessage PublishingRate Therateatwhichmessagesarepublishedtoatopic. transmissionlatency. The number of messages temporarily stored by rospy ifitcannotimmediatelypublishallofthemtothetopic. TopicQueueSize 5 Generallyrecommendedtobethesameasthepublishing rate. Erlang-specificVariables Child configuration parameters: restart, shutdown, restart=permanent(childrenare SupervisorChildSpecification type. alwaysrestarteduponfailure), shutdown=20ms(childprocessesthat donotrespondwithin20mstoanexit summonsareautomaticallyterminated), type=worker/supervisor one_for_one(ifonechildprocess Identifieschildprocessesthatarerestartedincaseofa SupervisorRestartStrategy terminatesandshouldberestarted,only failure.Quantifiestheimportanceofpartialfailures. thatchildprocessisaffected) Thenumberofrestartsallowedwithinaspecificperiodof MaximumRestartIntensity timebeforetheentireprogramisdeemedfaultyandshuts 1000restartsin1second down. appeardisadvantageous,thediscrepancyisduetothefactthatErlang • ReliabilityinthefaceofROSnodeorErlangprocessfailures isonlyadevelopmenttool,ratherthanafully-developedecosystem inducedbyaChaosMonkey[27](Section4.4). suchasROS.Hence,communicationandreliabilityfacilitiessuchas • Variationsoffacequalitywithworkerfailures(Section4.5). topicinteractionandtheroslaunchpackage2thatareincorporated into ROS, had to be implemented in the corresponding Erlang TheexperimentsareconductedonanInteli7-4700HQprocessor program. (4 cores, 2 threads per core), at 2.4 GHz, with 16GB DDR3 in RAM,running64-bitUbuntu15.04,ROSIndigo,Erlang/OTP18.0, 4. Experiments andOpenCV3.0.Table2summarisestheparametersusedinboth programs.Thesehelptoeliminatebiasinfavouroragainstoneof ToinvestigateErlangpotentialtoaddressthescalabilityandrelia- theprogramsandensureresultsderivedfromeachtestarevalid bilityissueswithcommunicationbetweenROSnodesweconduct and meaningful. Below, we provide further justification for the thefollowingexperimentsthatcomparetheROSandErlangface assignmentofparticularvalues. trackers. • scaleFactor defines percentage of up-scaling between two • Scalability measured as the maximum number of concurrent consecutivelevelsoftheimagescalepyramid.Here,1.1corre- workers:ROSnodesorErlangprocesses,andassociatedmemory spondstoa10%differenceinscale,which,forthepurposeof consumption(Section4.1). theexperiment,isanappropriatetrade-offbetweenperformance • Messagelatency(Section4.2). (speedofclassification)andthoroughnessofdetection. • Variations of face quality with a real-time aggregation (Sec- • minNeighborsaffectsthequalityoffacedetection.Thehigher tion4.3). thevalue,thebetterthequality(oraccuracy)ofthedetectedface, butthelessresults(faces)areobtainedfromanimage.Wesetit 2http://wiki.ros.org/roslaunch to6asweonlyexpectthedatasettohaveonefaceperimagebut TowardsReliableandScalableRobotCommunication 6 2016/7/29 thisfacemustbedetectedaccuratelytobestassesstheimpactof Table3. Scalability:MemoryConsumptionat50Agents failuresonfacequality. ROSsystem Erlangsystem • minSize.Tokeepthequalityofdetectedfaceshigh,theymust Pythonmainnode: ErlangVM: beatleast50x50pixels.Smallerfaceshavelesssharpfeatures ∼64MBx1instance ∼200MBx1instance (especiallywiththecamera’s640x480resolution).Thussmaller Pythonworkernodes: Pythoninterpreters: facesarehardertodetect. ∼50MBx50instances ∼22MBx50instances Total:2,564MB Total:1,300MB • Publishingrateof5Hzisslowerthantheratecommonlyused (10 Hz)inROStutorialsandexampleprogramsaswewantto ensurehigherdatacompleteness,i.e.allframesarrivingattheir theErlangprogramscalesupto700workerprocesses.Inthecase destinationnode. oftheROSsystem,thesefailuresareduetounspecifiedXorg(a • Queuesizeiskeptsmall(5),sincenomessagedropsoranyother displayserverapplicationforUbuntudistribution)errors,whilethe performanceimpactswereobserved.Asmallerqueuewould primarycauseofthefailuresintheErlangapplicationisreaching presumablyalsouselessmemory. theOS’maximumopensocketlimit.Theattemptstoincreasethis • Supervisor shutdown interval of 20ms allows for any non- limithavenotresolvedtheissue,soweattributetheerrortopoor responsiveprocesstobeconsideredashangedandthatneeds socketrecyclingonthepartoftheErlPortlibrary(wefurtherdiscuss a restarting protocol. We set this value to maintain real-time thisinSection5). constrains. Memorymayalsolimitthescalabilityofsystems.Toanalyse this,wecomparememoryconsumptionofROSandErlangface • Supervisorrestartstrategyone_for_onemirrorsROSrestart- recognition systems with 50 workers using measurements taken ingbehaviour,i.e.ROSonlyrestartsthefailedprocesses. fromUbuntu’sinbuilttaskmanagerandthetopcommand.Table3 • Supervisormaximumrestartintensity.Thedefaultvaluesforthe showsthatboththemainandtheworkerROSnodesarerelatively max_restartandmax_intervalvaluesare1and5,respec- resource-intensive,consuming65MBand50MBpernoderespec- tively;whichmeansamaximumof1restartwithinaspanof5 tively.TheexactresourceconsumptionofeachindividualErlang seconds.Thisprovedunsuitableforthepurposeofthereliabil- processcouldnotbeaccuratelydetermined,duetoabstractionre- ityexperiments,wheretheChaosMonkeyprogramwouldkill sultingfromtheself-containednatureoftheErlangvirtualmachine. processesmuchfasterthanthat.Thus,throughexperimentation, However,assumingequalmemorydistribution,itcanbeextrapo- a1000-in-1-secondvaluewaschosentobecompatiblewith latedthattheErlangmiddlewareconsumesapproximately200MB theChaosMonkeyfailurerate. per50processes=4MBperprocess.Ifcombinedwiththe22MB consumedbyeachPythoninstance,thetotalconsumptionofan Both ROS and Erlang face trackers are open source and can Erlangprocessingunitamountsto26MB,approximatelyhalfofthe be downloaded from https://github.com/AndreeaLutac/ correspondingROSvalue. Erlang-Camera-Control-Project. FromthisexperimentweconcludethatErlangcommunication providesatleast3.5timeslargerscalability(200ROSnodesagainst 4.1 Scalability 700 Erlang processes) while consuming half of the memory as Thisexperimentdeterminesthescalabilityofthetwosystemsby comparedtoasimilarROSimplementation. measuring the maximum number of face detection workers, i.e. themaximumnumberofROSnodesorErlangprocessesthatthe 4.2 MessageLatency correspondingprogramscansupportwhilefunctioningcorrectly, Theaimofthisexperimentistoquantifythesystems’performance withoutaseverestrainontheoperatingsystemandwithnofailures. as the number of nodes and processes increases. This is done Theexperimentproceedsbyincreasingthenumberofworkers(ROS by measuring the time each message spends in transit from one nodesorErlangprocesses)untiltheprogramsstartfailing. stage of the processing pipeline to another. In the case of ROS, Figure4plotsprocessingtime(i.e.periodbetweenobtaininga the verified latencies are the ones incurred by message passing framefromthecameraanddeliveringtheresulttothecorresponding betweenthecameraandeachfacedetectionnode,andbetweenthe faceoutput)againstthenumberofworkers.Thisisweakscaling,i.e. facedetectionnodeandtheaggregatorfunction.Inthecaseofthe moreworkersmeansthatthereismoreworktobedone.Thefigure Erlangapplication,measurementsaremadeofthedelaysbetween showsthattheROSprogramscalesupto200workernodes,while thePythoncameraaccessorsandtheErlangfacedetectionprocesses, betweentheErlangfacedetectionmodulesandthePythondetector, andbetweenthePythondetectorandtheErlangaggregatorprocess. Additionally, the mean time taken for the OpenCV detection to 120 executeismeasuredforeachprogram.Wethensequentiallystart 100 instances of the ROS and Erlang programs running increasingly ms) more workers and processes in each run. For each program, we me ( 80 obtainamessagelatencymeasurebytakingthedifferencesbetween Processing ti 4600 tmheeaFrseiugcrouerrsde.e5dtsihmoewsstatmhepsc,aalncdultahteendtalakteenthceiemsefaonroRfOthSea7n0d0lEartelanncgy programs. On a small scale of up to 9 nodes the total message 20 latencyisidenticalinbothROSandErlangsystems,andis∼37 milliseconds(redlineinFigure5).However,startingat9workers 0 thetimeincreasessharplyfortheErlangsystem,whiletheROS 0 100 200 300 400 500 600 700 Number of workers systemdemonstratesconstanttimeuntilitreachesitsmaximum ROS Erlang limitof200workers. The delay induced by the OpenCV face detection operation Figure4. Scalability:MaximumNumberofWorkers (magentalineinFigure5)isverysimilarinbothROSandErlang TowardsReliableandScalableRobotCommunication 7 2016/7/29 120 120 Processing time (ms) 1 46800000 Processing time (ms) 1 04680000 20 20 0 0 0 100 200 300 400 500 600 700 0 100 200 300 400 500 600 700 Number of workers Number of workers Total Python detec to Erl aggr Total Worker node to Aggregator Camera to Erl process Detection Main node to Worker node Detection Erl proc to Python detec (a) ROS (b) Erlang Figure5. FaceTrackingProcessingTime programs,thusitdoesnotaccountforanyperformancedifferences. Wethenclassifyeachindexbyitscorrespondingframe,program The detection functions perform similarly, with slightly higher version(ROSorErlang),andworkernodecount.Toestablishthe ROSvaluesbeingattributedtoanadditionalstepthedetectormust differencesbetweenthetwoimplementationsinthecontextofface performtoconverttheframefromanImage-typetopicmessageto quality, we average the indices and compare the resulting mean ausablepixelmatrix. data.Todistributeframestodetectorsweuseabroadcastapproach Furtheranalysisindicatesthatthespikecorrelatestothefunction discussedinSection3. call from Erlang detector processes to the Python module (blue FromTable4,weconcludethatthereisnosignificantdifference lineinFigure5(b)).Whileapossibleexplanationforthisanomaly between the quality indices resulting from ROS and Erlang pro- couldrelatetothesizeofthemessages(largepixelarrays),thesame grams.However,commontothesystemsaggregatordequeuelength behaviourisnotobservedinthecameratoErlangprocessrelay, variablewasobservedtohaveasignificantimpactonfacequality whichdealswithsimilarmessagesizes,andisalsoperformedvia whenfacesarerandomlydistributedamongstdetectors.Thisisdue I/Osocket.Therefore,weattributethediscrepancytofaultsinthe tothefactthatthelargerthedequeuesize,themoreout-datedfaces behaviourofErlPort’ssocketinterfaces. getaggregatedinthefinalfacereading.Theresultsalsoillustrate While ErlPort is a convenient method of linking Python and thatifalloutputfacesareconsideredindividually(atdequeuesize1 Erlang,itslimitedscalabilitysignificantlylimitsscalabilityofthe andhencewithnoaggregation)theresultingJ isveryclosetothe Q Erlang middleware. From this we conclude that Erlang is able expectedgroundtruthindexJ =1,whileifaggregationis Qactual to offer processing times which are at least as good as the ROS performed,thevaluesofJ dropswitheachincreaseindequeue Q analogue, but great care should be taken in selecting the cross- size. languageorcross-platformcommunicationlibraries. Therefore,weconcludethatErlang’sabilityasamiddleware tosupportapplicationfunctionsisnothinderedbygrowthinthe 4.3 FaceQualityVariationswithReal-timeAggregation numberofworkerprocesses. Inthisexperimentweassessthequalityoffacedetectionproduced 4.4 Reliability bymulti-agentinstancesoftheROSandErlangsystems,investigat- inganimpactofscalingthenumberofworkersontheperformance WeinvestigatethereliabilityoftheROSandErlangfacetrackers ofthetwoprograms. by terminating face-detecting workers and aggregator, and then For that we create a set of 100 (image, file) pairs, where the measurethelatencybetweentheirshutdownandrestart.Termination images are snapshots taken by the camera, and the files contain is done both by invoking the agents’ exit procedures, through a the coordinates of the primary face present in the image, deter- standard termination signal, and also abruptly, by immediately minedthroughOpenCVdetectionwiththesamehaarcascade_ orderingthekerneltoterminatetheagents. _frontalface_1classifier.Inaddition,wereplacethereal-time Toterminate“living”agentsfromtheirrespectivefacetracking camerafeedoftheROSandErlangprogramswiththepreparedset systems (ROS and Erlang) we have developed a pair of scripts of100imagesandruninstancesofeachprogramwithagentnum- inPythonandErlang.ThesescriptsaremodelledaftertheChaos bersincreasingasfollows:1,100,200nodes/processes.Foreach Monkey[27]reliabilitytestingserviceintroducedbyNetflix.On run,werecord100facepositionresultsand,foreachtestimage, bothROS andErlang systemsrunning100 workers,we instruct wecomparetheaccuracyofthefacepositionagainstamanually the chaos monkey program to execute 200 random terminations annotatedgroundtruth.Thelatterisperformedbyametricderived every1second.Wethencomputelatenciesforeachsystembased fromtheJaccardindex[15],thatcomputesthesimilarityoftwo samplesets.Thisvariantmetric,J ,replacesthesetswithrectangle Q areas: Table4. MeanIndicesofFaceQuality J (A ,A )= Aov , J ∈[0,1], (1) DequeueLength Q det act A +A −A Q 10 5 1 det act ov ROS 0.285 0.591 0.947 whereA istheareaoftherectanglethatdescribesthepredicted det Erlang 0.302 0.608 0.922 faceposition,A istheareaofrectanglethatdenotesthedefini- act Difference -5.62% -2.79% 2.71% tivelyestablishedfacelocation,andA istheareaoftheiroverlap. ov TowardsReliableandScalableRobotCommunication 8 2016/7/29 Table5. Reliability:ROSvs.ErlangRecoveryTime doesnotkeeptrackofundeliveredmessagestotopicsubscribers, Processtype Typeofshutdown meaningthatallmessagesthatdonothaveasubscriberareper- SIGTERM SIGKILL manentlylost.ThischaracteristicisobservedinFigures6–10asa suddenandsteepdecreaseinfacedetectionaccuracyimmediately ROS aftertheterminationsoccur,followedbyfluctuationsinthequality Mainnode 420ms 400ms percentage.Theselatterfluctuationsarebelievedtobecausedpri- Workernode 369ms 291ms marilybytheaggregatorreceivinganunexpectedfacereadingfor Erlang aframef andaggregatingitwithreadingsassociatedwith“older” Supervisorprocess 0.368ms 0.352ms i frames,f ,f ,etc.,thusskippingovertheklostframesand Workerandaggregatorprocess 0.224ms 0.249ms i−k i−k−1 resultinginafinalaggregationproductthat“lags”behindtheground truthreading. Asthesequenceofframesrelayedthroughthedetectorsnears ondifferencesbetweentherecordedtimestamps,andthentakethe itsend,theredandblueseriessinksharplytoa0%flatlineFor averageover500latencyreadings.Wefinallycomparetheresulting eachofthen-failuretests,thisdropoccursincreasinglysoonerinto meandatatoestablishwhichofthetwosystemsisfastertorestart theimagefeedandisconsideredtobeindicativeofthenumberof itsfailedagents. faceslostasaresultofprogressivelymorenodesbeingshutdown. TherestartlatencypresentedinTable5showsthatROStakes Throughouttheexecutionofthetests,theROSsystemwasobserved between291msand420mstorestartafailednode,whereasErlang toroutinelybeunabletorestartallofitsterminatedworkersbefore restartsitsprocessesin0.224ms–0.368ms,whichisapproximately thelastimageistransmitted. 0.06%to0.08%ofthetimetakenbyROS.Thesefindingsdemon- Startingwiththe50-failuresub-test,theErlangprogramalso stratethatErlangissignificantlymoreefficientthanROSatrestart- exhibitsdropsinfacequality,followingsimilarpatternstotheones ingprocessesthatfailrandomlyandunexpectedly(0.352msand described above for the ROS detectors. However, the impact is 0.249msagainst400msand291msrespectively).Thesignificance relativelysmallsinceErlangworkerprocessesarelargelyableto ofthisanalysisinthecontextofroboticsisthatsystemuptimefor recoverbeforetheendoftheimagesequence.Thus,byvirtueofits robotsrunninganErlangsupervisionmechanismhasthepotential sub-millisecondrestartdelays,theErlangfacetrackingsystemis tobesignificantlyhighercomparedtoROS-basedrobots. alwaysabletofinishitsoperationwithall100processesalive. 4.5 FaceQualityVariationswithWorkerFailures The last aspect to be addressed in the analysis of the experi- ment’sfindingsrelatestothesharpqualitydeclinebetweenframes Inthefinalexperimentweanalysetheimpactthatrandompartial 73and75ofmostn-failuretests.Theprecipitousdecreaseishy- failureshaveontheaccuracyofthefacedetectionprocessexecuted pothesisedtobecausedbyananomalyintheOpenCVdetection bythetwofacetrackingsystems.Forthat,weuseasetof100images process,whereby for ashort periodthe “real-time”detectors do createdfortheexperimentinSection4.3thatcontainpre-computed notdetectfacesinanarrayofconsecutiveframes,eventhoughthe facestoresembleasimulated“real-time”imagefeedwhilerunning “static”(groundtruth)detectordoes.Asitscauseshaveyettobe 100workers.Wetheninstructthechaosmonkeyscriptcreatedfor unequivocallydetermined,wemayfurtherexaminetheissueinthe theexperimentsinSection4.4tograduallyterminate10,25,50,75 future. and90oftheface-detectingworkers,withnopausebetweenthe FromthisexperimentweconcludethatErlangiscapablenot terminations.Foreachrunandeachtestimage,wecomparethe onlytoreducerobotcomponentdowntime,butalsomitigateneg- accuracyofthefacepositionsagainstgroundtruthusingtheJaccard ativeimpactofthesefailures.Thesefindingsbecomeparticularly similaritycoefficient(1).Wethencomparethevariationsinface compelling when considering the fact that many modern robots qualityovertimeforeachrunofthetwoframeworks. operateinsafety-criticalenvironmentsandanyperiodsofunstable Similarly to the results in Section 4.4, we observed that the behaviourcanresultinhazardousfunctioning. Erlangprogramismorereliableintermsofthefacedetection,even while experiencing sudden partial failures. Figures 6–10 show a percentage-basedrepresentationofqualityvariations,astheyare 5. Conclusion observedduringeachphaseofthetesting(i.e.with10,25,50,75, and90failedworkers).Werefertothesephasesasthen-failure We investigate whether Erlang has the potential to address the sub-tests,wheren ∈[10,25,50,75,90]. scalabilityandreliabilityissueswithcommunicationbetweenROS The data series represented by the green line is the quality nodesinroboticsystems.Thebasisoftheinvestigationisapairof baselinequantifiedbythepre-determinedfacecoordinatesforeach simpleface-trackingapplications:oneusingROSmessagerelayvia ofthe100images.Theredandbluelinesrepresentexperimentalface topics,andtheotherabespokemulti-processErlangframeworkto qualityregisteredafterthepointofmassagenttermination(normal transmitmessagesbetweenfunctionalunitsthatresideoutsideof andabrupt,respectively).Thepointofterminationisdepictedas theErlangVM.BothapplicationsinterfacewithanexternalPTZ thepurpledottedline,whichwehavetriedtokeepasconsistentas cameratocaptureframesinrealtimeandrelyonPython-basedHaar possiblewithinthecontextofeachsub-test. cascadeclassificationtoperformdistributedhumanfacedetection Inthecaseofthe10-failuretests(Figure6)andthe25-failure ontheseimages. tests (Figure 7), the Erlang application indicates no face quality We conduct five experiments to compare key aspects of the penalties, which is due to the worker processes being restarted ROS and Erlang face trackers (Section 4). We find that Erlang beforethecamerahasachancetocastanyframestofailedworkers. communication scales better, supporting at least 3.5 times more Thegraphlinesresultingfromtheseobservationsthuscoincides active processes (700 processes) (Figure 4) than its ROS-based entirelywiththe“groundtruth”line.This,however,doesnothold counterpart (200 nodes) while consuming half of the memory fortheROSprogram,whereevenforarelativelysmallnumberof (Table 3). However, while both face tracking prototypes exhibit terminated nodes, the high latency of node restarts causes some similardetectionaccuracyandtransmissionlatencieswhenkept framestobepublishedtotopicsthatarenolongerconnectedtotheir under10workers,Erlangexhibitsacontinuousincreaseinthetotal dedicatedworkers. timetakentoprocessaframeasmoreagentsareadded(Figure5). Unlikeotherlog-andtopic-basedmessagepassingframeworks, Sincethecauseofthisphenomenonisthoughttoliewiththeexternal suchaswidelyusedintheareaofbigdataApacheKafka[8],ROS VM linking library, the implication is that a highly concurrent TowardsReliableandScalableRobotCommunication 9 2016/7/29 (a) ROS (b) Erlang Figure6. QualityVariationwith10%FailedWorkers (a) ROS (b) Erlang Figure7. QualityVariationwith25%FailedWorkers (a) ROS (b) Erlang Figure8. QualityVariationwith50%FailedWorkers (a) ROS (b) Erlang Figure9. QualityVariationwith75FailedWorkers TowardsReliableandScalableRobotCommunication 10 2016/7/29
Description: