Table Of Content

Reducing Response-Time Bounds for DAG-Based Task Systems on Heterogeneous Multicore Platforms ∗ KechengYang,MingYang,andJamesH.Anderson DepartmentofComputerScience,UniversityofNorthCarolinaatChapelHill Abstract donebycellularbasestationsinwirelessnetworks.Dueto spaceconstraints,werefrainfromdelvingintospecificsre- Thispaperconsidersforthefirsttimeend-to-endresponse- garding this particular application domain, opting instead time analysis for DAG-based real-time task systems im- foramoreabstracttreatmentoftheproblemathand. plemented on heterogeneous multicore platforms. The spe- This problem involves the scheduling of real-time data- cific analysis problem that is considered was motivated by flowsonheterogeneouscomputationalelements(CEs),such anindustrialcollaborationinvolvingwirelesscellularbase as CPUs, digital signal processors (DSPs), or one of many stations. The DAG-based systems considered herein allow typesofhardwareaccelerators.Eachdataflowisrepresented intra-taskparallelism:whileeachinvocationofatask(i.e., byaDAG,thenodes(resp.,edges)ofwhichrepresenttasks DAG node) is sequential, successive invocations of a task (resp.,producer/consumerrelationships).Agiventaskisre- mayexecuteinparallel.Intheproposedanalysis,thischar- strictedtorunonaspecificCEtype.Taskpreemptionmay acteristicisexploitedtoreduceresponse-timebounds.Ad- beimpossibleforsomeCEsandshouldinanycasebedis- ditionally,thereissomeleewayinchoosinghowtosettasks’ couraged. Each DAG has a single source task that is in- relativedeadlines.Itisshownthatbyresolvingsuchchoices voked periodically. Intra-task parallelism is allowed in the holistically via linear programming, response-time bounds sense that consecutive jobs (i.e., invocations) of the same canbefurtherreduced.Finally,intheconsideredusecase, task can execute in parallel (but each job executes sequen- DAGsaredefinedbaseduponjustafewtemplatesandindi- tially).Infact,alaterjobcanfinishearlierduetovariations viduallyoftenhavequitelowutilizations.Itisshownthat,by in running times.1 The DAGs to be supported are defined combining many such DAGs into one of higher utilization, using a relatively small number of “templates,” i.e., many response-timeboundscanoftenbedrasticallylowered.The DAGs may exist that are structurally identical. The chal- effectiveness of these techniques is demonstrated via both lenge is to devise a multi-resource, real-time scheduler for case-studyandschedulabilityexperiments. supportingdataflowsasdescribedherewithaccompanying per-dataflow end-to-end response-time analysis. Note that, 1 Introduction althoughresponse-timeboundsarerequired(andshouldnot The multicore revolution is currently undergoing a second betoolarge),strictdeadlinesarenotenforced. waveofinnovationintheformofheterogeneoushardware Relatedwork. Theliteratureonreal-timesystemsincludes platforms. In the domain of real-time embedded systems, muchworkpertainingtotheschedulingofDAG-basedtask suchplatformsmaybedesirabletouseforavarietyofrea- systemsonidenticalmultiprocessorplatforms[1,3,4,5,6, sons.Forexample,ARM’sbig.LITTLEmulticorearchitec- 8,9,11,13,15,16,17,18,19,20,21,22].However,weare ture[7]enablesperformanceandenergyconcernstobebal- notawareofanycorrespondingworkdirectedatheteroge- anced by providing a mix of relatively slower, low-power neousplatforms.Moreover,theproblemabovehasfacets— coresandfaster,high-powerones.Unfortunately,themove such as allowing intra-task parallelism and defining poten- towards greater heterogeneity is further complicating soft- tially many DAGs based upon relatively few templates— ware design processes that were already being challenged that have not been fully explored. (Intra-task parallelism onaccountofthesignificantparallelismthatexistsin“con- has been considered in a limited context in DAG-related ventional” multicore platforms with identical processors. workpertainingtoidenticalmultiprocessors[6,8,21].)Ad- Suchcomplicationsareimpedingadvancementsintheem- ditionally,inpriorworkonthereal-timeschedulingoftask beddedcomputingindustrytoday. systems—whicharenotDAG-based—uponheterogeneous Problem considered herein. In this paper, we report on platforms, task-to-CE assignments have been a paramount our efforts towards solving a particular real-time analysis concern. In our setting, this is a non-issue, as this assign- problem concerning heterogeneity motivated by an indus- mentispre-determinedbasedonCEfunctionalities. trialcollaboration.Thisproblempertainstotheprocessing Beyond the real-time systems community, DAG-based systems implemented on heterogeneous platforms have ∗WorksupportedbyNSFgrantsCPS1239135,CNS1409175,andCPS 1446631, AFOSR grant FA9550-14-1-0161, ARO grant W911NF-14-1- 1Intheconsideredapplicationdomain,thedataproducedbysubsequent 0499,andfundingfrombothGeneralMotorsandFutureWeiCorp. jobscanbebuffereduntilallpriorjobsofthesametaskhavecompleted. 1 beenconsideredbefore(e.g.,[2,14,24]).However,allsuch 2 SystemModel work known to us focuses on one-shot, aperiodic DAG- Inthissection,weformalizethedataflow-schedulingprob- basedjobs,ratherthanperiodicorsporadicDAG-basedtask lemdescribedinSec.1andintroducerelevantterminology. systems. Moreover, real-time issues are considered only EachdataflowisrepresentedbyaDAG,asdiscussedearlier. obliquelyfromtheperspectivesofjobadmissioncontrolor We specifically consider a system G = {G ,G , jobmakespanminimization. 1 2 ...,G } comprised of N DAGs. The DAG G con- N i Contributions. Inthispaper,weformalizetheproblemde- sists of n nodes, which correspond to n tasks, de- i i scribedaboveandthenaddressitbyproposingascheduling noted τ1,τ2,...,τni. Each task τv releases a (po- approachandassociatedend-to-endresponse-timeanalysis. tentiallyiinfiinite) siequence of jobs iJv , Jv ,.... The Inthefirstpartofthepaper,weattacktheproblembypre- i,1 i,2 edges in G reflect producer/consumer relationships. A i senting a transformation process whereby successive task particular task τv’s producers are those tasks with models are introduced such that: (i) the first task model outgoing edges idirected to τv, and its consumers directly formalizes the problem above; (ii) prior analysis are those with incoming edgies directed from τv. can be applied to the last model to obtain response-time Thejth joboftaskτv,Jv ,can- i boundsunderearliest-deadline-first(EDF)scheduling;and not commence execiutioi,nj until 1 (iii)eachsuccessivemodelisarefinementofthepriorone the jth jobs of all of its produc- 1 inthesensethatallDAG-basedprecedenceconstraintsare ershavecompleted;thisensures preserved. Such a transformation approach was previously that its necessary input data is 2 3 usedbyLiuandAnderson[18]inworkonDAG-basedsys- 1 1 available. Such job dependen- tems, but that work focused on identical multiprocessors. cies only exist with respect to Moreover, our work differs from theirs in that we allow the same invocation of a DAG, 4 intra-task parallelism. This enables much smaller end-to- 1 and not across different invoca- endresponse-timeboundstobederived. tions. That is, while jobs must Afterpresentingthistransformationprocess,wediscuss execute sequentially, intra-task Figure1:ADAGG . two techniques that can reduce the response-time bounds 1 parallelismisallowed. enabledbythisprocess.Thefirsttechniqueexploitsthefact Example1. Fig.1showsanexampleDAG,G .Taskτ4’s thatsomeleewayexistsinsettingtasks’relativedeadlines. 1 1 producers are tasks τ2 and τ3, thus for any j, J4 needs Bysettingmoreaggressivedeadlinesfortasksalong“long” 1 1 1,j pathsinaDAG,theoverallend-to-endresponse-timebound inputdatafromeachofJ12,j andJ13,j,soitmustwaituntil of that DAG can be reduced. We show that such deadline those jobs complete. Because intra-task parallelism is al- adjustmentscanbemadebysolvingalinearprogram. lowed,J14,j andJ14,j+1couldpotentiallyexecuteinparallel. The second technique exploits the fact that, in the con- To simplify analysis, we assume that each DAG G has i sideredcontext,DAGsaredefinedusingrelativelyfewtem- exactlyonesourcetaskτ1,whichhasonlyoutgoingedges, i platesandtypicallyhavequitelowutilizations.Thesefacts and one sink task τni, which has only incoming edges. i enable us to reduce response-time bounds by combining Multi-source/multi-sink DAGs can be supported with the manyDAGsintooneoflargerutilization.Asaverysimple additionofsingular“virtual”sourcesandsinksthatconnect example,twoDAGswithaperiodof10timeunitsmightbe multiplesourcesandsinks,respectively.Virtualsourcesand combinedintoonewithaperiodof5timeunits.Aresponse- sinkshaveaworst-caseexecutiontime(WCET)ofzero. time-boundreductionisenabledbecausetheseboundstend We consider the scheduling of DAGs as just described tobeproportionaltoperiods.Intheconsideredapplication onaheterogeneoushardwareplatformconsistingofdiffer- domain,theextentofcombiningcanbemuchmoreexten- ent types of CEs. A given CE might be a CPU, DSP, or sive:upwardsof40DAGsmaybecombinable. somespecializedhardwareaccelerator(HAC).TheCEsare As a final contribution, we evaluate our proposed organizedinM CEpools,whereeachCEpoolπ consists k techniques via case-study and schedulability experiments. ofm identicalCEs.Eachtaskτv hasaparameterPv that k i i These experiments show that our techniques can signifi- denotes the particular CE pool on which it must run, i.e., cantlyreduceresponse-timebounds.Furthermore,ouranal- Pv =π meansthateachjobofτv mustbescheduledona i k i ysis supports “early releasing” [10] (see Sec. 7) to im- CEintheCEpoolπ .TheWCEToftaskτvisdenotedCv. k i i prove observed end-to-end response times. We experimen- AlthoughtheproblemdescriptioninSec.1indicatedthat tallydemonstratetheefficacyofthisaswell. sourcetasksarereleasedperiodically,wegeneralizethisto Organization. In the rest of this paper, we formalize the allow sporadic releases, i.e., for the DAG G , the job re- i considered problem (Sec. 2), present the refinements men- leases of τ1 have a minimum separation time, denoted T . i i tionedabovethatenabletheuseofprioranalysis(Secs.3– Anon-sourcetaskτv (v > 1)releasesitsjth jobJv when i i,j 4), show that the bounds arising from this analysis can be thejth jobsofallitsproducertasksinG havecompleted. i improved via linear programming (Sec. 5) and DAG com- That is, letting av and fv denote the release (or arrival) i,j i,j bining(Sec.6),discussearlyreleasing(Sec.7),presentour andfinishtimesofJv ,respectively, i,j case-study(Sec.8)andschedulability(Sec.9)experiments, andconclude(Sec.10). av =max{fw |τw isaproducerofτv}. (1) i,j i,j i i 2 J11,1 J11,2 WealsouseΓ todenotethesetoftasksthatarerequiredto τ1 k 1 executeontheCEpoolπ ,i.e., k J12,1 J12,2 Γ ={τv |Pv =π }. (4) τ2 k i i k 1 The overutilization of a CE pool could cause unbounded J13,1 J13,2 responsetimes,sowerequireforeachk, τ3 1 (cid:88) uv ≤m . (5) J14,2 i k τiv∈Γk τ4 J14,1 1 3 Offset-BasedIndependentTasks end-to-end response time end-to-end response time In this section, we present a second task model, which as Time shown in Sec. 4 can be viewed as a refinement of that just 0 5 10 15 20 presented. The prior model is somewhat problematic be- Job Release Job Deadline Job Completion CPU Execution DSP Execution cause of difficult-to-analyze dependencies among jobs. In (Assume depicted jobs are scheduled alongside other jobs, which are not shown.) particular, by (1), the release times of jobs of non-source tasksdependonthefinishtimesofotherjobs,andhenceon Figure2:ExampleschedulefortheDAGinG inFig.1. their execution times. By (2), deadlines (and hence priori- 1 ties)ofjobsareaffectedbysimilardependencies. Inordertoeaseanalysisdifficultiesassociatedwithsuch The response time of job Jv is definedas fv −av , and i,j i,j i,j jobdependencies,weintroduceheretheoffset-basedinde- theend-to-endresponsetimeoftheDAGG asfni −a1 . i i,j i,j pendent task (obi-task) model. Under this model, tasks are Example 2. Fig. 2 depicts an example schedule for the partitionedintogroups.Theithsuchgroupconsistsoftasks DAGG1 inFig.1,assumingtaskτ12 isrequiredtoexecute denoted τi1,τi2,...,τini, where τi1 is a designated source on a DSP and the other tasks are required to execute on a task that releases jobs sporadically with a minimum sepa- CPU.Thefirst(resp.,second)jobofeachtaskhasalighter rationofTi.Thatis,foranypositiveintegerj, (resp., darker) shading to make them easier to distinguish. a1 −a1 ≥T . (6) Tasksτ2 andτ3 haveonlyoneproducer,τ1,sowhentask i,j+1 i,j i 1 1 1 τ1finishesajobattimes3and8,τ2andτ3releaseajobim- 1 1 1 Job releases of each non-source task τv are governed by a mediately.Incontrast,taskτ4hastwoproducers,τ2andτ3. i 1 1 1 new parameter Φv, called the offset of τv. Specifically, τv Therefore,τ4cannotreleaseajobattime5whenτ3finishes i i i 1 1 releases its jth job exactly Φv time units after the release ajob,butrathermustwaituntiltime6whenτ2alsofinishes i 1 timeofthejthjobofthesourcetaskτ1ofitsgroup.Thatis, ajob.Notethatconsecutivejobsofthesametaskmightexe- i cuteinparallel(e.g.,J4 andJ4 executeinparallelduring av =a1 +Φv. (7) 1,1 1,2 i,j i,j i [11,12)).Furthermore,foragiventask,alater-releasedjob (e.g., J4 ) may even finish earlier than an earlier-released Forconsistency,wedefine 1,2 one(e.g.,J4 )duetoexecution-timevariations. Φ1 =0. (8) 1,1 i Scheduling. SincemanyCEsarenon-preemptible,weuse Undertheobi-taskmodel,ajobofataskτv canbesched- i thenon-preemptiveglobalEDF(G-EDF)schedulingalgo- uledatanytimeafteritsreleaseindependentlyoftheexecu- rithmwithineachCEpool.ThedeadlineofjobJv isgiven tionofanyotherjobs,evenjobsofthesametaskτv. i,j i by Thedefinitionssofarhavedealtwithjobreleases.Addi- dv =av +Dv, (2) tionally,thetwoper-taskparametersCvandPvfromSec.2 i,j i,j i i i areretainedwiththesamedefinitions. whereDvistherelativedeadlineoftaskτv.Forexample,in i i Thefollowingpropertyshowsthateveryobi-taskτv has theexamplescheduleinFig.2,relativedeadlinesofD1 = i 1 aminimumjob-releaseseparationofT . 7,D2 =4,D3 =6,andD4 =8areassumed. i 1 1 1 Property1. Foranyobi-taskτv,av −av ≥T . In the context of this paper, deadlines mainly serve the i i,j+1 i,j i purpose of determining jobs’ priorities, rather than strict Proof. timing constraints for individual jobs. Therefore, deadline av −av = {by(7)} i,j+1 i,j misses are acceptable as long as the end-to-end response (a1 +Φv)−(a1 +Φv) timeofeachDAGcanbereasonablybounded. i,j+1 i i,j i ≥ {by(6)} Utilization. Wedenotetheutilizationoftaskτv by i T i Cv uv = i . (3) i T i 3 4 Response-TimeBounds instantiation of the npc-sporadic task τv = (Cv,T ,Dv). i i i i Hence,anyinstantiationofanobi-taskset{τv |Pv = π } In this section, we establish two results that enable prior i i k is an instantiation of the npc-sporadic task set {τv | Pv = worktobeleveragedtoestablishresponse-timeboundsfor i i π }.Also,sinceobi-tasksexecuteindependentlyofonean- DAG-basedtasksystems.First,weshowthat,undertheobi- k other, obi-tasks executing in different CE pools cannot af- taskmodelwitharbitraryoffsetsettings,per-taskresponse- fecteachother.SinceeachCEpoolπ hasm identicalpro- time bounds can be derived by exploiting prior work per- k k cessors,theproblemwemustconsideristhatofscheduling taining to a task model called the npc-sporadic task model aninstantiationofthenpc-sporadictaskset{τv|Pv =π } (“npc” stands for “no precedence constraints”—this refers i i k on m identical processors. Since Theorem 1 applies to a tothelackofprecedenceconstraintsamongjobsofthesame k non-concretenpc-sporadictaskset,itappliestoeverycon- task) [12, 25]. Second, we show that, by properly setting creteinstantiationofsuchataskset.Thus,wehavethefol- offsets,any DAG-basedtask systemcanbe transformedto lowingresponse-timeboundforeachobi-taskτv: acorrespondingobi-tasksystem. i   4.1 Response-TimeBoundsforObi-Tasks 1 (cid:88) (cid:88) Riv =m Div· uwl + (uwl ·max{0,Tl−Dlw}) Annpc-sporadictaskτi isspecifiedby(Ci,Ti,Di),where k τlw∈Γk τlw∈Γk C isitsWCET,T istheminimumseparationtimebetween i i m −1 consecutivejobreleasesofτ ,andD isitsrelativedeadline. + max {Cw}+ k Cv, (9) Asbefore,τ ’sutilizationisiu =Ci/T . τlw∈Γk l mk i i i i i The main difference between the conventional sporadic whereΓ ={τw |Pw =π }. taskmodelandthenpc-sporadictaskmodelisthatthefor- k l l k Notethat(9)isapplicableaslongasallrelativedeadlines mer requires successive jobs of each task to execute in se- arenon-negative,andappliestoanyarbitraryoffsetsetting. quence while the latter allows them to execute in parallel. That is, under the conventional sporadic task model, 4.2 FromDAG-BasedTaskSetstoObi-TaskSets jobJ cannotcommenceexecutionuntilitspredecessor i,j+1 J completes, even if a , the release time of J , We now show that, by properly setting offsets, any DAG- i,j i,j+1 i,j+1 haselapsed.Incontrast,underthenpc-sporadictaskmodel, based task set can be transformed to an obi-task set, and anyjobcanexecuteassoonasitisreleased.Notethat,al- per-DAG end-to-end response-time bounds can be derived thoughweallowintra-taskparallelism,eachindividualjob byleveragingtheobi-taskresponse-timeboundsjuststated. stillmustexecutesequentially. AnyDAG-basedtasksetcanbeimplementedbyanobi- Yang and Anderson [25] investigated the G-EDF tasksetinanobviousway:eachDAGbecomesanobi-task schedulingofnpc-sporadictasksonuniformheterogeneous groupwiththesametaskdesignatedasitssource,andallTi, multiprocessor platforms where different processors may Civ, and Piv parameters are retained without modification. havedifferentspeeds.Bysettingeachprocessor’sspeedto Whatislessobviousishowtodefinetaskoffsetsunderthe be1.0,thefollowingtheoremfollowsfromtheirwork. obi-task model. This is done by setting each Φvi (v (cid:54)= 1) parametertobeaconstantsuchthat Theorem 1. (Follows from Theorem 4 in [25]) Consider the scheduling of a set of npc-sporadic tasks τ on m iden- Φv ≥ max {Φk+Rk}, (10) i i i ticalmultiprocessors.Undernon-preemptiveG-EDF,each τk∈prod(τv) i i npc-taskτ ∈τ hasthefollowingresponse-timebound,pro- (cid:80) i where prod(τv) denotes the set of obi-tasks corresponding vided u ≤m. i τl∈τ l totheDAG-basedtasksthataretheproducersoftheDAG- basedtaskτv inG ,andRk denotesaresponse-timebound (cid:32) (cid:18) (cid:19)(cid:33) i i i 1 (cid:88) (cid:88) fortheobi-taskτk.Fornow,weassumethatRk isknown, D · u + u ·max{0,T −D } i i m i τl∈τ l τl∈τ l τl∈τ l l butlater,wewillshowhowtocomputeit. m−1 Example3. ConsideragaintheDAGG1inFig.1.Assume +mτl∈aτx{Cl}+ m Ci. thhaavte, arfetseproanpspel-ytiimngetbhoeuanbdosvoeftrRa1ns=for9m,aRtio2n=, th5e,oRb3i-t=ask7s, 1 1 1 We now show that Theorem 1 can be applied to obtain andR1 =8,respectively.Then,wecansetΦ1 =0,Φ2 =9, 1 1 1 per-taskresponse-timeboundsforanyobi-taskset. Φ3 = 9,andΦ4 = 16,respectively,andsatisfy(10).With 1 1 these response-time bounds, the end-to-end response-time Concretevs.non-concrete. Aconcretesequenceofjobre- boundthatcanbeguaranteedisdeterminedbyR1,R3,and leases that satisfies a task’s specification (under either the 1 1 R1andisgivenbyR =24.Fig.3depictsapossiblesched- obi-ornpc-sporadictaskmodel)iscalledaninstantiationof 1 1 ule for these obi-tasks and illustrates the transformation. thattask.Aninstantiationofataskset isdefinedsimilarly. LikeinFig.2,thefirst(resp.,second)jobofeachtaskhasa Incontrast,ataskoratasksetthatcanhavemultiple(poten- lighter(resp.,darker)shading,andintra-taskparallelismis tiallyinfinite)instantiationssatisfyingitsspecification(e.g., possible(e.g.,J4 andJ4 intimeinterval[23,24)). minimumreleaseseparation)iscallednon-concrete. 1,1 1,2 By Property 1, any instantiation of an obi-task τv is an Thefollowingpropertiesfollowfromthistransformation i 4 Thus, a DAG-based task set can be transformed to an R1 1 obi-tasksetwiththesameper-taskparameters.Giventhese τ11 per-task parameters, a response-time bound for each obi- Φ11= 0 R21 taskcanbecomputedby(9)foranyarbitraryoffsetsetting. τ2 Then, we can properly set the offsets for each obi-task ac- 1 Φ12 R13 ceoacrdhinDgAtGo (i1n0t)opboylocgoincsaildeorridnegr,thstearctoinrrgeswpiothndΦin1g=task0sfoinr i τ3 eachsourcetaskτ1,by(8).ByProperty2,theresultingobi- 1 i Φ3 tasksetsatisfiesallrequirementsoftheoriginalDAG-based 1 R4 1 tasksystem,andbyProperty3,anend-to-endresponse-time τ4 1 boundR canbecomputedforeachDAGG . Φ4 i i 1 (Note that the response-time bound for a virtual end-to-end response time source/sinkisnotcomputedby(9),butiszerobydefinition, end-to-end response time sinceitsWCETiszero.Anyjobofsuchataskcompletesin R1 : the end-to-end response-time bound for G1 Time zerotimeassoonasitisreleased.) 0 5 10 15 20 25 30 Job Release Job Deadline Job Completion CPU Execution DSP Execution 5 SettingRelativeDeadlines (Assume depicted jobs are scheduled alongside other jobs, which are not shown.) Inthepriorsections,weshowedthat,byapplyingourpro- posed transformation techniques, an end-to-end response- Figure3:Examplescheduleoftheobi-taskscorrespondingtothe time bound for each DAG can be established, given ar- DAG-basedtasksinG1inFig.1. bitrary but fixed relative-deadline settings. That is, given Dv ≥ 0 for any i,v, we can compute correspond- i process.AccordingtoProperty2,aDAG-basedtasksetcan ing end-to-end response-time bounds (i.e., Ri for each i) beimplementedbyacorrespondingsetofobi-tasks,andall by(9),(8),(10),and(13). producer/consumerconstraintsintheDAG-basedspecifica- SimilarDAGtransformationapproacheshavebeenpre- tion will be implicitly guaranteed, provided the offsets of sented previously [11, 18], but under the assumption that theobi-tasksareproperlyset(i.e.,satisfy(10)). intra-taskprecedenceconstraintsexist(i.e.,jobsofthesame task must execute in sequence). Moreover, in this prior Property 2. If τk is a producer of τv in the DAG-based work, per-task relative deadlines have been defined in a i i tasksystem, then forthe jth jobsofthe correspondingtwo DAG-obliviousway.Byconsideringtheactualstructureof obi-tasks,fk ≤av . such a DAG, it may be possible to reduce its end-to-end i,j i,j response-timeboundbysettingitstasks’relativedeadlines Proof.By(7),ak =a1 +Φk,andbythedefinitionofRk, i,j i,j i i soastofavorcertaincriticalpaths. fk ≤ak +Rk.Thus, i,j i,j i Consider, for example, the DAG 10 (8) illustrated in Fig. 4. Suppose that fk ≤a1 +Φk+Rk. (11) i,j i,j i i theprioranalysisyieldsaresponse- 10 time bound of 10 for each task, (8) By(7),av =a1 +Φv,andby(10),Φv ≥Φk+Rk.Thus, i,j i,j i i i i as depicted within each node. The 10 (15) correspondingend-to-endresponse- av ≥a1 +Φk+Rk. (12) 10 i,j i,j i i time bound would then be 40 and (8) isobtainedbyconsideringtheright- By(11)and(12),fk ≤av . i,j i,j sidepath.Now,supposethatweal- 10 Property 3 shows how to compute an end-to-end ter the tasks’ relative-deadline set- (8) response-timeboundR . tings to favor the tasks along this i path at the possible expense of the Figure 4: More Property 3. In the obi-task system, for each j, all jobs remaining task on the left. Further, highly prioritizing Ji1,j,Ji2,j,··· ,Jin,ji finish their execution within Ri time suppose this modification changes the right-side path unitsaftera1i,where the per-task response-time bounds in this DAG de- creases its end-to- to be as depicted in parentheses. R =Φni +Rni. (13) end response-time i i i Then,this modificationwouldhave bound. the impact of reducing the end-to- Proof.By(7)andthedefinitionofRv,Jv finishesbytime i i,j endboundto32. a1 +Φv+Rv.Thus,Jni inparticularfinisheswithinΦni+ i,j i i i,j i Inthissection,weshowthattheproblemofdetermining Rini = Ri time units after a1i. Also, by (10), Φvi +Riv ≤ the“best”relative-deadlinesettingscanbecastasalinear- Φnii,sinceτini isthesinglesinkinGi.BecauseΦnii ≤ Ri, programmingproblem,whichcanbesolvedinpolynomial thisimpliesthat,foranyv,Jiv,j finisheswithinRitimeunits time.Theproposedlinearprogram(LP)isdevelopedinthe aftera1. nexttwosubsections. i 5 5.1 LinearProgram Thereare|E|distinctconstraintsinthisset,where|E|isthe totalnumberofedgesinallDAGsinthissystem. InourLP,therearethreevariablespertaskτv:Dv,Φv,and i i i Finally,wehaveasetofconstraintsthatarelinearequal- Rv.TheparametersT ,Cv,andm areviewedasconstants. i i i k ityconstraints. Thus,thereare3|V|variablesintotal,where|V|isthetotal numberoftasks(i.e.,nodes)acrossallDAGsinthesystem. ConstraintSet(iii): WithConstraintsSet(i),itisclearthat Beforestatingtherequiredconstraints,wefirstestablishthe wecanre-write(9)asfollows,foreachtaskτv, i followingtheorem,whichshowsthatarelative-deadlineset-   tingofDy >T ispointlesstoconsider. Tthheeroersepmonxs2e.-tIifmxDexbyo>undTxof, ttahseknτbyy,wseitltlindgecDrexyase=,aTnxd,eRacxyh, Riv =m1k Div·τlw(cid:88)∈Γkuwl +τlw(cid:88)∈Γk(cid:0)uwl ·(Tl−Dlw)(cid:1) x m −1 othertask’sresponse-timeboundwillremainthesame. + max {Cw}+ k Cv, (14) Proof. To begin, note that, by (9), τy does not impact τlw∈Γk l mk i x the response-time bounds of those tasks executing on CE where Γ = {τw|Pw = π }. Moreover, by (8), for each pools other than Pxy. Therefore, the response-time bounds DAGGik, l l k {Riv | Piv (cid:54)= Pxy} for such tasks are not altered by any Φ1i =0. changetoDy. x Intheremainderoftheproof,weconsiderataskτv such Constraint Set (iii) yields |V| + |G| linear equations, i that Pv = Py. Let Rv and R(cid:48)v denote the response-time where|G|denotesthenumberofDAGs. i x i i boundsforτv beforeandafter,respectively,reducingDy to ConstraintSets(i),(ii),and(ii)fullyspecifyourLP,with i x T .Ifi=xandv =y,thenby(9), theexceptionoftheobjectivefunction.InthisLP,thereare x 3|V|variables,2|V|+|E|inequalityconstraints,and|V|+ Ry −R(cid:48)y =Dxy−Tx · (cid:88) uw+ |G|linearequalityconstraints. x x m l k τw∈τk l 5.2 ObjectiveFunction uy x (max{0,T −Dy}−max{0,T −T }) m x x x x Different objective functions can be specified for our LP k thatoptimizeend-to-endresponse-timeboundsindifferent >{sinceDy >T } x x senses.Here,weconsiderafewexamples. 0. Single-DAG systems. For systems where only a single Alternatively,ifi(cid:54)=xorv (cid:54)=y,thenby(9), DAG exists, the optimization criterion is rather clear. In order to optimize the end-to-end response-time bound of Rv−R(cid:48)v =uyx (max{0,T −Dy}−max{0,T −T }) thesingleDAG,theobjectivefunctionshouldminimizethe i i m x x x x end-to-endresponse-timeboundoftheonlyDAG,G .That k 1 ={sinceDy >T } is,thedesiredLPisasfollows. x x 0. minimize Φn1 +Rn1 1 1 subjectto ConstraintSets(i),(ii),and(iii) Thus,thetheoremfollows. By Theorem 2, the reduction of Dy mentioned in the x Multiple-DAG systems. For systems containing multiple theoremdoesnotincreasetheresponse-timeboundforany DAGs, choices exist as to the optimization criteria to con- task. By (10), this implies that none of the offsets, {Φv}, i sider.Welisttwohere. needstobeincreased.Therefore,byProperty3,noend-to- endresponse-timeboundincreases.Thesepropertiesmoti- Minimizingtheaverageend-to-endresponse-time vateourfirstsetoflinearconstraints. bound: ConstraintSet(i): Foreachtaskτv, i (cid:88) minimize (Φni +Rni) 0≤Dv ≤T . i i i i i subjectto ConstraintSets(i),(ii),and(iii) 2|V|individuallinearinequalitiesarisefromthisconstraint set,where|V|isthetotalnumberoftasks. Another issue we must address is that of ensuring that Minimizing the maximum end-to-end response- the offset settings, given by (10), are encoded in our LP. timebound: Thisgivesrisetothenextconstraintset. minimize Y ConstraintSet(ii): ForeachedgefromτiwtoτivinaDAG subjectto ∀i: Φni +Rni ≤Y G , i i i Φv ≥Φw+Rw. ConstraintSets(i),(ii),and(iii) i i i 6 T if all of its producers have already finished. For example, in Fig. 3, J3 cannot execute until time 9, even though its 1,1 G1 correspondingproducerjob,J1 ,hasfinishedbytime3. 1,1 Observed response times can be improved under G2 deadline-based scheduling without altering analytical response-timeboundsbyusingatechniquecalledearlyre- T/2 leasing[10].Whenearlyreleasingisallowed,ajobiseligi- G bleforexecutionassoonasallofitscorrespondingproducer [1,2] Corresponding jobshavefinished,evenifthisconditionissatisfiedbefore to Invocations of: G1 G2 G1 G2 G1 G2 itsactualreleasetime.Earlyreleasingdoesnotimpactana- DAG Releases Identical DAGs lyticalresponse-timebounds.Inanappendix,whyprovide a brief explanation as to why this is the case and consider Figure5:IllustrationofDAGcombining. anexampleschedulewhereearlyreleasingisallowed. 6 DAGCombining 8 CaseStudy In the application domain that motivates our work, the To illustrate the computational details of our analysis, we DAGs to be scheduled are typically of quite low utiliza- considerhereacase-studysystemconsistingofthreeDAGs, tionsandaredefinedbasedonarelativelysmallnumberof G , G , and G , which are specified in Fig. 6. π is a CE 1 2 3 1 templates that define various computational patterns. Two poolconsistingoftwoidenticalCPUs,andπ isaCEpool 2 DAGsdefinedusingthesametemplatearestructurallyiden- consisting of two identical DSPs. Thus, m = m = 2. 1 2 tical:theyaredefinedbygraphsthatareisomorphic,corre- ThesethreeDAGshavefewernodesandhigherutilizations spondingnodesfromthetwographsperformidenticalcom- thantypicallyfoundinourconsideredapplicationdomain. putations, the source nodes are released at the same time, However,onecanimaginethatthesegraphswereobtained etc.Suchstructurallyidenticalgraphscanbecombinedinto fromcombiningmanyidenticalgraphsoflowerutilization. one graph with a reduced period and larger utilization, as Whileitwouldhavebeendesirabletoconsiderlargergraphs long any overutilization of the underlying hardware plat- withmorenodes,graphsfromourchosendomaintypically form is avoided. Such combining can be a very effective have tens of nodes, and this makes them rather unwieldy technique,becauseastheexperimentspresentedlatershow, to discuss. Still, the general conclusions we draw here are ourresponse-timeboundstendtobeproportionaltoperiods. applicabletolargergraphs. We illustrate this idea with a simple example. Consider Utilization check. First, we must calculate the total uti- twoDAGsG andG withacommonperiodofT thatare 1 2 lization of all tasks assigned to each CE pool to make structurally identical. A schedule of these two DAGs is il- sure that neither is overutilized. We have (cid:80) uv = lustratedabstractlyatthetopofFig.5.Asillustratedatthe τiv∈Γ1 i bottomofthefigure,ifthesetwoDAGsarecombined,then 200+100+300 + 133+78+197+73+5 = 1.686 < 2, and 500 1000 theyarereplacedbyastructurallyidenticalgraph,denoted (cid:80) uv = 380 + 16+83+242 =1.101<2. τiv∈Γ2 i 500 1000 here as G , with a period of T/2. With this change, the [1,2] Virtual source/sink. Note that all DAGs have a single providedresponse-timeboundshavetobeslightlyadjusted. sourceandsink,exceptforG ,whichhastwosinks.Forit, 2 Forexample,ifG hasaresponse-timeboundofR , [1,2] [1,2] we connect its two sinks to a single virtual sink τ6, which then this would also be a response-time bound for G , but 2 1 hasaWCETof0andaresponse-timeboundof0.Wecall thatforG wouldbeR + T,becauseincombiningthe 2 [1,2] 2 the resulting DAG G(cid:48). For convenience, we include a de- two graphs, the releases of G are effectively shifted for- 2 2 pictionofthisgraphinFig.10inanappendix. ward by T time units. While this graph combining idea is 2 Implicit deadlines. We now show how to compute reallyquitesimple,theexperimentspresentedlatersuggest response-time bounds assuming implicit deadines, i.e., thatitcanhaveaprofoundimpactintheconsideredappli- Dv = 500 for 1 ≤ v ≤ 4, Dv = 1000 for 1 ≤ v ≤ 5 cation domain. In particular, in that domain, per-DAG uti- 1 2 (the relative deadline of the virtual sink is irrelevant), and lizations are low enough that upwards of 40 DAGs can be Dv =1000for1≤v ≤3.Inordertoderiveanend-to-end combinedintoone.Thus,theactualperiodreductionisnot 3 merelybyafactorof 1 butbyafactorashighas 1 . response-timebound,wefirsttransformtheoriginalDAG- 2 40 based tasks into obi-tasks as described in Sec. 3. Next, we calculate a response-time bound Rv for each obi-task τv 7 EarlyReleasing i i by(9).Theresultingtaskresponse-timebounds,{Rv},are i Transforming a DAG-based task system to a correspond- listedinTable2,whichisgivenintheappendix.Notethat, ing obi-task system enabled us to derive an end-to-end asavirtualsink,theresponse-timeboundforthevirtualsink response-timeboundforeachDAG.However,suchatrans- τ6 doesnotneedtobecomputedby(9),butis0bydefini- 2 formationmayactuallycauseobservedend-to-endresponse tion.By(8)and(10),theoffsetsoftheobi-taskscannowbe timesatruntimetoincrease,becausetheoffsetsintroduced computed in topological order with respect to each DAG. inthetransformationmaypreventajobfromexecutingeven The resulting offsets, {Φv}, are also shown in Table 2. i 7 T = 500 1 τ2 T2= 1000 τ3 τ4 1 2 2 T = 1000 τ11 CP1221== 3π820 τ14 τ21 τ22 CP2233== π823 CP2244== π1197 3 τ31 τ32 τ33 CP1111== π2100 C13τ=13 100 CP1441== π3100 CP1221== π1133 CP2222== π126 C25τ=25 78 CP1133== π713 CP3322== π2242 CP3333== π51 P31= π1 P5= π 2 1 (a)G1 (b)G2 (c)G3 Figure6:DAGsinthecase-studysystem.G hastwosinks,sotoanalyzeit,avirtualsinkτ6mustbeaddedthathasaWCETof0anda 2 2 response-timeboundof0.WeshowtheresultinggraphinFig.10inanappendix. Finally, by Property 3, we have an end-to-end response- 9 SchedulabilityStudies time bound for each DAG: R = Φ4 + R4 = 2538.25, 1 1 1 Inthissection,weexpanduponthespecificcasestudyjust R =Φ6+R6 =4361.5,andR =Φ3+R3 =3376.5. 2 2 2 3 3 3 describedbyconsideringgeneralschedulabilitytrendsseen LP-based deadline settings. If we use LP techniques to forrandomlygeneratedtasksystems. optimizeend-to-endresponse-timebounds,thenchoicesex- istregardingtheobjectivefunction,becauseoursystemhas 9.1 ImprovementsEnabledbyBasicTechniques multipleDAGs.Weconsiderthreechoiceshere. We first consider the improvements enabled by the ba- Minimizing the average end-to-end response-time sic techniques covered in Secs. 4 and 5 that underlie our bound. For this choice, relative-deadline settings, obi-task work: allowing intra-task parallelism as provided by the response-timebounds,andobi-taskoffsetsareasshownin npc-sporadictaskmodel,anddeterminingrelative-deadline Table3(a),foundintheappendix.Theresultingend-to-end settingsbysolvinganLP. response-time bounds are R = Φ4 + R4 = 3134.5, 1 1 1 Random system generation. We considered a heteroge- R =Φ6+R6 =2341.2,andR =Φ3+R3 =1736.2. 2 2 2 3 3 3 neousplatformcomprisedofthreeCEpools,eachconsist- Minimizing the maximum end-to-end response-time ingofeightidenticalCEs.Eachpoolwasassumedtohave bound. For this choice, relative-deadline settings, obi-task thesametotalutilization.Weconsideredallchoicesoftotal response-timebounds,andobi-taskoffsetsareasshownin per-poolutilizationsintherange[1,8]inincrementsof0.5. Table3(b).Theresultingend-to-endresponse-timebounds We generated DAG-based task systems using a method are R1 = Φ41 +R14 = 2650.4, R2 = Φ62 +R26 = 2650.4, similar to that used by others [3, 15]. These systems were andR3 =Φ33+R33 =2650.4. generated by first specifying the number of DAGs in the Minimizing the maximum proportional end-to-end system, N, and the number of tasks per DAG, n. For response-timebound.Forthischoice,relative-deadlineset- each considered pair N and n, we randomly generated 50 tings, obi-task response-time bounds, and obi-task off- task-systemstructures,eachcomprisedofN DAGswithn sets are as shown in Table 3(c). The resulting end-to-end nodes.Eachnodeinsuchastructurewasrandomlyassigned response-time bounds are R = Φ4 + R4 = 2208.9, tooneoftheCEpools,andforeachDAGinthestructure, 1 1 1 R =Φ6+R6 =4417.8,andR =Φ3+R3 =4261.0. one node was designated as its source, and one as its sink. 2 2 2 3 3 3 Further,eachpairofinternalnodes(notasourceorasink) Early releasing. As discussed in Sec. 7, early releasing was connected by an edge with probability edgeProb, can improve observed response times without compromis- a settable arameter. Such an edge was directed from the ing response-time bounds. The value of allowing early re- lower-indexednodetothehigher-indexednode,topreclude leasing can be seen in the results reported in Table 1. This cycles.Finally,anedgewasaddedfromthesourcetoeach tablegivesthelargestobservedend-to-endresponsetimeof internalnodewithnoincomingedges,andtothesinkfrom each DAG, assuming implicit deadlines with and without eachinternalnodewithnooutgoingedges. earlyreleasing,inaschedulethatwassimulatedfor50,000 Foreachconsideredper-pooluntilizationandeachgen- timeunits.Analyticalboundsareshownaswell. erated task-system structure, we randomly generated 50 actual task systems by generating task utilizations using theMATLABfunctionrandfixedsum()[23].Accord- G1 G2 G3 ing to the application domain that motivates this work,2 Earlyreleasing 1006 897 453 we defined each DAG’s period to be 1 ms. (A task’s Noearlyreleasing 1966.75 3536.25 2586.0 WCET is determined by its utilization and period.) For Bounds 2538.25 4361.5 3376.5 2Inapplicationsusuallyconsideredinthereal-time-systemscommu- Table1:Observedend-to-endresponsetimeswith/withoutearly nity,muchlargerperiodsarethenorm.Theconsidereddomainisquite releasingandanalyticalend-to-endresponse-timeboundsforthe different. implicit-deadlinesetting. 8 identical DAGs will exist. We proposed the technique of ms)40 Bounds (35 cnnoppnccv--sseppnootirroaanddaiiccl sttaapssokkrssa,,d iLimcP pt-abliscakists ed, deim addpellaiincdietl isndeesadlines DduAcGerceospmobnisnei-ntgiminebSoeuc.nd6st.oWeexpnloowitdthisiscufassctsctohefduurtlhaebrilritey- Time 30 experimentsthatweconductedtoevaluatethistechnique. End-to-End Response-122505 RrcssayturanssundtsceeodtmdumormilesnystrsScugyeocescmtntu.eeprm9rera.si1tsig,encedegonxmoecsfrepyaprsNitttseioetmthdneasm.to,ptWfhilnaaeNsttteeesimas.DdApAsloidoGmfdysigie,lteadionrwnaeetaroapllgtryitoen,hncgaweetsrteaasdtsiiekonsd--f- m mu10 troduced a new parameter K that indicates the number of Maxi 5 identical DAGs per template. A period of 1 ms was still Average 01 2 3 4 5 6 7 8 aCsosomcpiaatreidsownitsheteuapch. DWAeGco.mparedtwostrategies:(i)dono Total Utilization in Each CE Pool combining, and compute end-to-end response-time bounds Figure 7: AMERBsasafunctionoftotalutilizationineachCE assuming N · K independent DAGs; (ii) combine identi- poolinthecasewhereeachtasksethasfiveDAGs,20tasksper cal DAGs, and compute end-to-end response-time bounds DAG,andedgeProb=0.5. assuming N DAGs, making adjustments as discussed in Sec.6toobtainactualresponse-timeboundsfortheDAGs each considered value of N, n, and total per-pool utiliza- thatwerecombined.Underbothstrategies,thegeneraltech- tion(onepointinoneofourgraphs),weconsidered50(task niquesevaluatedinSec.9.1wereapplied. systemstructures)×50(utilizations)=2,500tasksets. Results. Inallcasesthatweconsidered,theDAGcombin- Comparison setup. We compared three strategies: (i) ing technique improved end-to-end response-time bounds transforming to a conventional sporadic task system and significantly.Duetospaceconstraints,wepresenthereonly using implicit relative deadlines, which is a strategy used the case where each system has five templates, each of in prior work on identical platforms [18]; (ii) transform- which has 20 nodes, and edgeProb = 0.5. Other results ing to an npc-sporadic task system and using implicit rel- can be found online. For the considered case, Fig. 8 plots ative deadlines; and (iii) transforming to an npc-sporadic AMERBs as a function of total per-pool utilization, when task system and using LP-based relative deadlines. When the number of identical DAGs per template is fixed to 40 applying our LP techniques, we chose the objective func- (thisnumberisclosetowhatwouldbeexpectedintheappli- tionthatminimizesthemaximumend-to-endresponse-time cation domain that motivates this work). Also, Fig. 9 plots bound.Althoughanidenticalplatformwasassumedin[18], AMERBs as a function of the number of identical DAGs thetechniquesfromthatpapercanbeextendedtoheteroge- per template, when every CE pool is fully utilized (i.e., neousplatformsinasimilarwaytothispaper. the total utilization of each pool is eight). In this case, the AMERB metric was calculated over all task sets that have Results. Inallcasesthatweconsidered,thetwoevaluated thesamenumberofidenticalDAGspertemplate.Notethat techniquesimprovedend-to-endresponse-timebounds,of- theAMERBsinFig.8aremuchlowerthanthoseinFig.7, tensignificantly.Duetospaceconstraints,wepresenthere only the case where N = 5, n = 20, and edgeProb = even before applying the DAG combining technique. This is because the systems considered in Fig. 8 have far more 0.5.Foreachgeneratedtaskset,werecordedthemaximum DAGsthanthoseinFig.7.Asaresult,foreachgiventotal end-to-endresponse-timeboundamongitsfiveDAGs.For per-poolutilization,thesystemsinFig.8havemuchlower eachgiventotalper-poolutilizationpoint,wereportherethe per-DAGandper-taskutilizations. averageofthemaximumend-to-endresponse-timebounds amongthe2,500tasksetsgeneratedforthatpoint.Wecall Accordingtoourindustrypartners,intheconsideredap- thismetrictheaveragemaximumend-to-endresponse-time plicationdomain,aDAG’send-to-endresponse-timebound bound(AMERB).Fig.7plotsAMERBsasafunctionofto- shouldtypicallybeatmost2.35ms.AsobservedinFig.8, talper-CE-poolutilization.Asseen,theapplicationofboth intheabsenceofDAGcombining,AMERBsinthisexper- techniquesreducedAMERBsby39.42%to81.65%. iment were as high as 8.2 ms. However, the introduction ofDAGcombiningenabledadroptolessthan2.0ms,even In addition to this plot, we also considered other cases with different values for N, n, and edgeProb. These whentheplatformwasfullyutilized.Thisdemonstratesthat DAG combining—as simple as it may seem—can have a other results show similar trends and can be found in an online appendix (available at http://cs.unc.edu/ powerfulimpactinthetargeteddomain. ãnderson/papers.html). 10 Conclusion 9.2 ImprovementsEnabledbyDAGCombining We presented task-transformation techniques to provide AsmentionedinSec.6,intheapplicationdomainthatmo- end-to-endresponse-timeboundsforDAG-basedtasksim- tivates our work, DAGs are usually defined using several plementedonheterogenousmultiprocessorplatformswhere well-definedcomputationaltemplates,andasaresult,many intra-taskparallelismisallowed.WealsopresentedanLP- 9 msBounds ()10 DCoo mnboitn ceo imdebnintiec aidl eDnAtiGcasl DAGs [3] Sgwtre.iinbtBhueaotberuiudnsaSphdya.isscttkIermiimnbpgusr,ttoe1evd5ceh(rd2ne)aim:ql1-uu0teil7mtsi–.pe1IrE1os8cyEe,sEst2es0Tmo0rras4ng.bsloyabceatxilopsnlcoshioteindnuPglaasrbcaihllleietdyluaalnendahlDoylsiesiss- me 8 ofsporadicDAGtasksystems.In26thECRTS,2014. Ti Response- 6 [[54]] SSra..dBBicaarDruuAaahhG..tFaTeshkdeesryfaesttededemrsasct.ehdIendsu1cl8hintehgdDuolAfinTsgpEo,or2fa0dc1ioc5n.DstAraGinetads-kdesaydsltienmess.pIon- End 29thIPDPS,2015. End-to- 4 [6] SA..BWaireusaeh.,AV.gBenoenriaflaiczie,dAp.arMalalerclhteatstki-Smpoadceclafmoerlrae,cLur.rSentoturgeaiel-,tiamnde m Maximu 2 [7] bpirgo.cLeIsTseTsL.EInP3r3orcdeRssTinSgS.,20h1tt2p.://www.arm.com/products/processors Average 01 2 3 4 5 6 7 8 [8] V/stie.bcBihliontnyoilfaoangcaiiel,ysAs/bi.siMgilnaitrttclhehepertsotpic-oSerspasadicniccga.DpmhAepGl.a,tSas.kStmilloedr,eal.ndInA.2W5thiesEeC.RFTeaS-, Total Utilization in Each CE Pool 2013. [9] H.S.Chwa,J.Lee,K.M.Phan,A.Easwaran,andI.Shin.Globaledf Figure 8: AMERBsasafunctionoftotalutilizationineachCE schedulabilityanalysisforsynchronousparalleltasksonmulticore poolinthecasewherethenumberofidenticalDAGspertemplate platforms.In25thECRTS,2013. isfixedto40. [10] U.Devi.SoftReal-TimeSchedulingonMultiprocessors.PhDthesis, UniversityofNorthCarolina,ChapelHill,NC,2006. msBounds ()1102 DCoo mnboitn ceo imdebnintiec aidl eDnAtiGcasl DAGs [11] Gr2e0.s1pE4ol.lniosettt,iNm.eKsoimfa,uJt.oEmrioctkivseond,aCta.flLoiwus,aonndmJu.lAtincdoerers.oInn.2M0thinRimTCizSinAg, me [12] J.EricksonandJ.Anderson.ResponsetimeboundsforG-EDFwith- Response-Ti 8 [13] Jmo.uoCtdi.enFltrofaon-rtsaerseckaal,p-Vtrie.mcNeedepelainsrc,aelGlce.olRnasaptrpraalviicni,atstai.noIdnnsL1w.5iMtthh.OcPoPinnOdhDioti.IoSnA,a2lm0e1ux1let.ci-uDtiAonG. End 6 In30thSAC,2015. End-to- 4 [14] Tty.pGinrganfdoprireerarle-,tiCm.eLeamvabreedndneed,haentderYo.gSenoereolu.sOmputlitmipirzoecdesrasoprids.pInro7toth- m CODES,1999. mu Average Maxi 025 10 15 20 25 30 35 40 45 50 [[1156]] JJpe..raaLrLtaeiil,,dleAKla..tnaSAdsakgigfsrlu.aolwIblnaaahll2,,5sKCtchh..eAELdCugu,rRlaiaTnwnSgda,lf,2CoC0.r1.Gp3Ga.ilirlla.ll,leAalnndraelCayl.s-iLtsimuo.efAgtanlosakblysa.sliEsIDonfF2f6efodthr- Number of Identical DAGs per Template ECRTS,2014. [17] J.Li,A.Saifullah,K.Agrawal,C.Gill,andC.Lu.Capacityaugmen- Figure9:AMERBsasafunctionofthenumberofidenticalDAGs tationboundoffederatedschedulingforparallelDAGtasks.Techni- pertemplateinthecasewheretotalutilizationineachCEpoolis calreport,WashingtonUniversityinStLouis,2014. fixedtoeight. [18] C.LiuandJ.Anderson. Supportingsoftreal-timeDAG-basedsys- temsonmultiprocessorswithnoutilizationloss.In31stRTSS,2010. based method for setting relative deadlines and a DAG [19] C.Maia,M.Bertogna,L.Nogueira,andL.M.Pinho. Globaledf schedulabilityanalysisforsynchronousparalleltasksonmulticore combining technique that can be applied to improve these platforms.In22ndRTNS,2014. bounds.Weevaluatedtheefficacyoftheseresultsbyconsid- [20] A.Melani,M.Bertogna,V.Bonifaci,A.Marchetti-Spaccamela,and ering a case-study task system and by conducting schedu- G.C.Buttazzo.Response-timeanalysisofconditionalDAGtasksin lability studies. To our knowledge, this paper is the first multiprocessorsystems.In27thECRTS,2015. to present end-to-end response-time analysis for the con- [21] A.Parri,A.Biondi,andM.Marinoni.Responsetimeanalysisforg- edfandg-dmschedulingofsporadicDAG-taskswitharbitrarydead- sidered context. In future work, we intend to extend these line.In23rdRTNS,2015. techniques to deal with synchronization requirements and [22] ASaifullah,K.Agrawal,C.Lu,andC.Gill. Multi-corereal-time to incorporate methods for limiting interference caused by schedulingforgeneralizedparalleltaskmodels.In32ndRTSS,2011. contentionforsharedhardwarecomponentssuchascaches, [23] R. Stafford. Random vectors with fixed sum. http://www. memorybanks,etc. mathworks.com/matlabcentral/fileexchange/ 9700-random-vectors-with-fixed-sum. Acknowledgments: We thank Alan Gatherer, Lee Mc- [24] G.StavrinidesandH.Karatza. Schedulingmultipletaskgraphsin Fearin, and Peter Yan for bringing the problem studied in heterogeneousdistributedreal-timesystemsbyexploitingschedule this paper to our attention and for answering many ques- holeswithbinpackingtechniques. SimulationModellingPractice andTheory,19(1):540–552,2011. tionsregardingcellularbasestations. [25] K.YangandJ.Anderson. OptimalGEDF-basedschedulersthatal- lowintra-taskparallelismonheterogeneousmultiprocessors. InES- References TIMedia,2014. [1] P.Axer,S.Quinton,M.Neukirchner,R.Ernst,B.Dobel,andH.Har- tig.Response-timeanalysisofparallelfork-joinworkloadswithreal- timeconstraints.In25thECRTS,2013. [2] R.BajajandD.Agrawal.Schedulingmultipletaskgraphsinhetero- 10

DTIC AD1014923: Reducing Response Time Bounds for DAG-Based Task Systems on Heterogeneous Multicore Platforms PDF

1.8 MB·English

by Defense Technical Information Center

#additional_collections #dticarchive

Checking for file health...

Save to my drive

Quick download

Download

Upgrade Premium

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview DTIC AD1014923: Reducing Response Time Bounds for DAG-Based Task Systems on Heterogeneous Multicore Platforms

See more

The list of books you might like

Upgrade Premium

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.