Task Allocation and Scheduling of Concurrent
Applications to Multiprocessor Systems
Kaushik Ravindran
Electrical Engineering and Computer Sciences
University of California at Berkeley
Technical Report No. UCB/EECS-2007-149
http://www.eecs.berkeley.edu/Pubs/TechRpts/2007/EECS-2007-149.html
December 13, 2007
Copyright © 2007, by the author(s).
All rights reserved.
Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that copies
bear this notice and the full citation on the first page. To copy otherwise, to
republish, to post on servers or to redistribute to lists, requires prior specific
permission.
Task Allocation and Scheduling of
Concurrent Applications to Multiprocessor Systems
by
Kaushik Ravindran
B.S. (Georgia Institute of Technology) 2001
A dissertation submitted in partial satisfaction of the
requirements for the degree of
Doctor of Philosophy
in
Engineering - Electrical Engineering and Computer Sciences
in the
GRADUATE DIVISION
of the
UNIVERSITY OF CALIFORNIA, BERKELEY
Committee in charge:
Professor Kurt Keutzer, Chair
Professor John Wawrzynek
Professor Alper Atamtürk
Fall 2007
Task Allocation and Scheduling of
Concurrent Applications to Multiprocessor Systems
Copyright 2007
by
Kaushik Ravindran
Abstract
Task Allocation and Scheduling of
Concurrent Applications to Multiprocessor Systems
by
Kaushik Ravindran
Doctor of Philosophy in Engineering - Electrical Engineering and Computer Sciences
University of California, Berkeley
Professor Kurt Keutzer, Chair
Programmable multiprocessors are increasingly popular platforms for high performance
embedded applications. An important step in deploying applications on multiprocessors is to
allocate and schedule concurrent tasks to the processing and communication resources of the
platform. When the application workload and execution profiles can be reliably estimated at
compile time, it is viable to determine an application mapping statically. Many applications
from the signal processing and network processing domains are statically scheduled on
multiprocessor systems. Static scheduling is also relevant to design space exploration for
micro-architectures and systems.
Owing to the computational complexity of optimal static scheduling, a number of heuristic
methods have been proposed for different scheduling conditions and architecture models.
Unfortunately, these methods lack the flexibility necessary to enforce implementation and
resource constraints that complicate practical multiprocessor scheduling problems. While it is
important to find good solutions quickly, an effective scheduling method must also reliably
capture the problem specification and flexibly accommodate diverse constraints and objectives.
This dissertation is an attempt to develop insight into efficient and flexible methods for
allocating and scheduling concurrent applications to multiprocessor architectures. We conduct
our study in four parts. First, we analyze the nature of the scheduling problems that arise in
a realistic exploration framework. Second, we evaluate competitive heuristic, randomized, and
exact methods for these scheduling problems. Third, we propose methods based on mathematical
and constraint programming for a representative scheduling problem. Though expressiveness and
flexibility are advantages of these methods, generic constraint formulations suffer prohibitive
run times even on modestly sized problems. To alleviate this difficulty, we advance several
strategies to accelerate constraint programming, such as problem decompositions, search
guidance through heuristic methods, and tight lower bound computations. The inherent
flexibility, coupled with improved run times from a decomposition strategy, positions
constraint programming as a powerful tool for multiprocessor scheduling problems. Finally, we
present a toolbox of practical scheduling methods, which provide different trade-offs with
respect to computational efficiency, quality of results, and flexibility. Our toolbox is
composed of heuristic methods, constraint programming formulations, and simulated annealing
techniques. These methods are part of an exploration framework for deploying network processing
applications on two embedded platforms: Intel IXP network processors and Xilinx FPGA based soft
multiprocessors.
Professor Kurt Keutzer
Dissertation Committee Chair
“Better to remain silent and be thought a fool than to speak out and remove all doubt.”
– Abraham Lincoln
Contents
List of Figures
List of Tables
1 The Trend to Single Chip Multiprocessor Systems
1.1 Deploying Concurrent Applications on Multiprocessors
1.1.1 The Implementation Gap
1.1.2 A Methodology to Bridge the Implementation Gap
1.2 The Mapping Problem for Multiprocessor Systems
1.2.1 Static Models, Static Scheduling
1.2.2 Complexity of Static Scheduling
1.2.3 Common Methods for Static Scheduling
1.3 The Quest for Efficient and Flexible Scheduling Methods
1.4 Contributions of this Dissertation
2 A Framework for Mapping and Design Space Exploration
2.1 A Framework for Mapping and Exploration
2.1.1 Domain Specific Language for Application Representation
2.1.2 The Mapping Step
2.1.3 Performance Analysis and Feedback
2.2 The Network Processing Domain: Applications and Platforms
2.2.1 Network Processing Applications
2.2.2 Intel IXP Network Processors
2.2.3 Xilinx FPGA based Soft Multiprocessors
2.3 Exploration Framework for Network Processing Applications
2.3.1 Domain Specific Language for Application Representation
2.3.2 The Mapping Step
2.3.3 Performance Analysis and Feedback
2.4 Motivation for an Efficient and Flexible Mapping Approach
3 Models and Methods for the Scheduling Problem
3.1 Models for Static Scheduling
3.1.1 The Application Task Graph Model
3.1.2 The Multiprocessor Architecture Model
3.1.3 Performance Model for the Task Graph
3.1.4 Optimization Objective
3.1.5 Implementation and Resource Constraints
3.2 Methods for Static Scheduling
3.2.1 Heuristic Methods
3.2.2 List Scheduling using Dynamic Levels
3.2.3 Evolutionary Algorithms
3.2.4 Simulated Annealing
3.2.5 Enumerative Branch-and-Bound
3.2.6 Mathematical and Constraint Programming
3.3 Scheduling Tools and Frameworks
3.4 The Right Method for the Job
4 Constraint Programming Methods for Static Scheduling
4.1 A Representative Static Scheduling Problem
4.1.1 Multiprocessor Architecture Model
4.1.2 Application Task Graph
4.1.3 Execution Time and Communication Delay Models
4.1.4 Valid Allocation, Valid Schedule
4.1.5 Optimization Objective
4.1.6 Example
4.1.7 Implementation and Resource Constraints
4.1.8 Complexity of the Scheduling Problem
4.2 A Mixed Integer Linear Programming Formulation
4.2.1 Variables
4.2.2 Constraints
4.3 Mapping Results for Network Processing Applications
4.3.1 IPv4 Packet Forwarding on FPGA based Soft Multiprocessors
4.3.2 Differentiated Services on the IXP1200 Network Processor
4.4 A Case for an Efficient and Flexible Mapping Approach
5 Techniques to Accelerate Constraint Programming Methods
5.1 The Concept of Problem Decomposition
5.1.1 Related Decomposition Approaches for Scheduling Problems
5.1.2 Overview of the Decomposition Approach
5.2 A Decomposition Approach for Static Scheduling
5.2.1 Master Problem Formulation
5.2.2 Sub Problem Decomposition Constraints
5.2.3 Algorithmic Extensions to Improve Performance
5.3 Evaluation of the Decomposition Approach
5.3.1 Benchmark Set
5.3.2 Comparisons to Heuristics and Single-Pass MILP Formulations
5.3.3 Extensibility of Constraint Programming
5.4 An Efficient and Flexible Mapping Approach
6 A Toolbox of Scheduling Methods
6.1 The Value of Good Heuristics
6.1.1 Dynamic Level Scheduling Revisited
6.1.2 Guidance for Search in Branch-and-Bound Methods
6.1.3 Evaluation of Heuristic Search Guidance
6.2 Simulated Annealing for Large Task Graphs
6.2.1 A Generic Simulated Annealing Algorithm
6.2.2 Annealing Strategy for the Representative Scheduling Problem
6.3 Evaluation of Scheduling Methods
6.4 The Right Method for the Job
7 Conclusions and Further Work
7.1 Constraint Programming Methods for Scheduling
7.2 A Toolbox of Practical Scheduling Methods
7.3 Exploration Framework for Network Processing Applications
Bibliography