Table Of ContentTowards Parallelizable Sampling–based
Nonlinear Model Predictive Control
R.V.Bobiti M.Lazar
∗ ∗
DepartmentofElectricalEngineering,EindhovenUniversityofTechnology,
∗
TheNetherlands(e–mails:r.v.bobiti@tue.nl,m.lazar@tue.nl)
Abstract: This paper proposes a new sampling–based nonlinear model predictive control (MPC)
algorithm,withaboundoncomplexityquadraticinthepredictionhorizonN andlinearinthenumberof
samples.Theideaoftheproposedalgorithmistousethesequenceofpredictedinputsfromtheprevious
timestepasawarmstart,andtoiterativelyupdatethissequencebychangingitselementsonebyone,
starting from the last predicted input and ending with the first predicted input. This strategy, which
7 resemblesthedynamicprogrammingprinciple,allowsforparallelizationuptoacertainlevelandyields
1 a suboptimal nonlinear MPC algorithm with guaranteed recursive feasibility, stability and improved
0 costfunctionateveryiteration,whichissuitableforreal–timeimplementation.Thecomplexityofthe
2 algorithm per each time step in the prediction horizon depends only on the horizon, the number of
samples and parallel threads, and it is independent of the measured system state. Comparisons with
n
a thefminconnonlinearoptimizationsolveronbenchmarkexamplesindicatethatasthesimulationtime
J progresses,theproposedalgorithmconvergesrapidlytothe“optimal”solution,evenwhenusingasmall
2 numberofsamples.
1
Keywords:suboptimalnonlinearpredictivecontrol,sampling–basedoptimization,embeddedmodel
]
Y predictivecontrol,dynamicprogramming,constrainedcontrol
S
. 1. INTRODUCTION concerns in ENMPC related to robustness, feasibility of the
s
offline optimization and finding the neighbors in the sampled
c
[ Avastliteratureonnonlinearmodelpredictivecontrol(NMPC) gridfortheoff–gridstates,ENMPC,whensuccessful,reduces
has proven both the theoretical (Gru¨ne and Pannek, 2011), as significantlythecomputationalloadofMPCattheexpenseof
2 well as the practical advantage (Magni et al., 2009) of this anacceptablecostdegradation.
v
methodintreatingoptimallythecontrolofmulti–variablenon-
0 Input and state space sampling–methods for solving NMPC
linearsystemssubjecttoconstraintsonstateandinputs.Com-
6 via approximate DP have also been proposed, see (Bertsekas,
mon research interests include methods for reducing the com-
6 2005a), though they inherit the dimensionality issues of DP
plexity of the NMPC algorithms, to make them applicable on
2 (LeeandLee,2004).
devicesoflowmemory(ASIC,FPGA),suchasexplicitNMPC
0
. (ENMPC), see, e.g., (Johansen, 2004). Much work has been Another relevant sampling–based strategy, the so–called sam-
1
doneontreatingotherlimitingfactors,suchastherequirement pling based MPC (SBMPC), was proposed in (Dunlap et al.,
0
of NMPC of solving an optimization problem online. This is 2010). The method therein is applicable to nonlinear systems
7
1 not well achieved by common optimization tools, which have in general, though, its performance is dependent on a user–
: no specific termination time, especially due to non–convexity specified heuristic. A more ample discussion on sampling–
v which involves multiple local–minima. Therefore, real time basedDPandSBMPC,inthelightofthemethodproposedin
i
X requirementsarenotmet,whichlimitstheindustrialimpactof thispaperisreportedinSection2.3.
NMPC.Toalleviatetheseconcerns,theliteraturehasproposed
r Acommonproblemofsampling–guidedmethodsforNMPCis
a multiplesolutions,suchasapproximatedynamicprogramming
the sampling strategy. For example, with each input sample, a
(DP) (Bertsekas, 2005a), suboptimal MPC (Scokaert et al.,
treeisexpanded.Afterthetreeisbuilt,thepathofleastcostin
1999), approximative DP and suboptimal control via rollout
the tree is selected from the initial state to the desired state.
algorithms(Bertsekas,2005b),NMPCbasedonsystemapprox-
If the sampling is performed over the input space, and each
imationbyneuralmodels(Ławryn´czuk,2009).
sample is connected to all the samples in the input space for
AnalternativestrategyinNMPCistodrawsamplesfromeither the next time step in the control horizon, then the tree growth
the state or input space, to design computationally feasible isexponentialwiththehorizon.Alternatively,asinrandomized
NMPCmethods,seeforexample,(PiovesanandTanner,2009) MPC (Piovesan and Tanner, 2009), sampling randomly in the
which proposes a randomized approach to sampling the space input space, of dimension m, augmented to the horizon of
of predicted input sequences. More recently, in (Chakrabarty dimension N requires a large number of samples, in an mN
et al., 2016), an ENMPC method was proposed based on dimensionalspace,toachieveasignificantaccuracy.
sampling of the state space for continuously differentiable
In this paper we adopt a suboptimal formulation of NMPC,
nonlinear systems. The method therein solves optimization
as originally proposed in (Scokaert et al., 1999), where it
problems offline to find optimal control sequences, which are
wasshownthatfeasibilityofasolutionimpliesstabilityunder
used to construct the ENMPC strategy. While there are still
suitableconditions.This,togetherwiththefactthatsuboptimal max{|x |,...,|x |}. For a scalar x ∈ R, denote by (cid:100)x(cid:101) the
1 n
NMPChasthesameinherentrobustnesspropertiesasoptimal smallestintegernumberlargerthanx.
NMPC,see(Pannocchiaetal.,2011)and(LazarandHeemels,
A function α : R → R is said to belong to class K, i.e.,
2009),suggestthatsuboptimalNMPCisaviableandinfactthe + +
α ∈ K, if it is continuous, strictly increasing and α(0) = 0.
bestonecanhopeforwhenasampling–guidedMPCstrategyis
Furthermore,α∈K ifα∈Kandlim α(s)=∞.
undertaken for the control of nonlinear systems. Furthermore, s
∞ →∞
we aim at a sampling method which provides a suboptimal
solutionthatyieldsgoodcontrolperformance,hasareasonable
2.2 SuboptimalMPCproblemformulation
computationalcomplexityincreasewiththepredictionhorizon
andallowsforparallelimplementationuptosomelevel.
Letusconsiderthediscrete–timesystemdescribedby
In this paper, the main idea for achieving this goal is to use
x =f(x ,u ), (1)
theshiftedsequenceofpredictedinputsfromtheprevioustime k+1 k k
step as a warm start, and to iteratively update this sequence wherexk ∈ Rn isthestateanduk ∈ Rm isthecontrolvector
by changing its elements one by one, starting from the last at discrete–time k ∈ Z+. We assume that the map f : Rn ×
predicted input and ending with the first predicted input. This Rm → Rn satisfies f(0,0) = 0, which is, the origin is an
strategy resembles the dynamic programming principle, espe- equilibriumpointforsystem(1).
ciallytherolloutalgorithms,see(Bertsekas,2005b),whichim-
The goal of MPC is to regulate the state to the origin while
proveaheuristicbasepolicyforoptimalcontrol.Additionally, satisfyingcontrolandstateconstraints,i.e.,u ∈U⊂Rm and
k
in this paper, we sample the original input space, which is x ∈ X ⊂ Rn for all k ∈ Z , where U and X are proper
typicallyrepresentedbyapropersetU⊂Rm.Samplingallows k +
sets.MPCreliesonareceding–horizoncontrollawinorderto
for parallelization of the calculations performed for updating
determine,foreachk,afinite–sequenceofcontrolinputs
each of the elements of the predicted sequence of inputs and
U(k)={u ,u ,...,u },
it enables limiting the computational time according to the kk k+1k k+N 1k
| | − |
requirements of the considered application. An upper–bound where N is the control and prediction horizon, which are
onthecomplexityoftheoverallalgorithmisquadraticwiththe considered equal in this paper, for simplicity of exposition. If
predictionhorizonN andlinearwiththenumberofsamplesin theinitialstateisx = x andthecontrolsequenceisU(k),
kk k
U.Thisenablestheusageoflongpredictionhorizonsorreal– thesolutionofsystem| (1)inclosed-loopwithU(k)attimek+i
timeimplementationoninexpensivecomputingdevicessuchas isdenotedbyφ(x ,U(k),i).Thecurrentcontrolactionu ,is
kk k
ASICandFPGA.Thesuitabilityforreal–timeimplementation selectedasthefirst|controlactioninU(k),i.e.,u =u .
k kk
is also enhanced by the fact that the algorithm can be stopped |
atanyiterationperformedwithinasamplingperiod,whilethe Toachievethis,optimalMPCminimizes,ateachdiscrete–time
complexity of the calculations per iteration depends only on k,acostfunctionofthetype
the horizon N, the number of samples and parallel threads, N 1
(cid:88)−
and it does not depend on the measured state of the system. J(x ,U(k))=V (x )+ L(x ,u ), (2)
kk f k+N k k+ik k+ik
Moreover, the updated predicted sequence of inputs obtained | | | |
i=0
at any iteration will guarantee recursive feasibility, stability where J : Rn × Rm → R is the total cost function, V :
and an improved cost function under the same conditions as Rn →R isaterminalcosta+ndL:Rn×Rm →R isastafge
insuboptimalNMPC(Scokaertetal.,1999). + +
cost.TheminimizationisperformedwithrespecttoU(k)and
The remainder of this paper is organized as follows. In Sec- itissubjectto
tion 2, basic notation is introduced, together with the problem x =f(x ,u ), ∀j ∈Z , (3)
k+jk k+j 1k k+j 1k [1,N]
formulation and a discussion on the relation with the existing | − | − |
andtothestateandinputconstraints.
methods.Section3presentsthemainresultasaprototypealgo-
rithmanditscomplexityanalysis.InSection4,threeexamples When the dynamics f is a nonlinear, possibly non-convex
illustratethepotentialofthedevelopedmethod,andSection5 function,theoptimizationofthecost(2)cannotbeguaranteed
concludesthepaper. to converge to a global optimum, in general. Alternatively,
suboptimalMPCwasproposedasaviablealternative,see,e.g.,
2. PRELIMINARIES (Scokaert et al., 1999), to deal with this inherent shortcoming
ofnonlinearglobaloptimization.
2.1 Notation
Assume that the following constraints are required to hold at
LetR,R ,ZandZ denotethefieldofrealnumbers,theset eachiterationoftheMPCproblem:
+ +
of non–negative reals, the set of integers and the set of non– x ∈X,u ∈U, ∀i∈Z , (4)
k+ik k+ik [0,N 1]
negative integers, respectively. For every c ∈ R and Π ⊆ R, | | −
and
defineΠ :={k ∈Π|k ≥c}andsimilarlyΠ .Letint(S)
denote th≥ecinterior of a set S. Let Sh := S×..≤.c×S for any xk+N k ∈XT, (5)
h ∈ Z 1 denotetheh–timesCartesian–productofS ⊆ Rn.A where XT ⊆ X is a proper|set which represents a terminal
set S ⊂≥ Rn is called proper if it is non–empty, compact and constraint. Moreover, define by U(x ) the set of control
kk
0∈int(S). sequencesU(k)which,appliedonx ,|satisfy(3),(4)and(5).
kk
|
For a vector x ∈ Rn, the symbol (cid:107)x(cid:107) is used to denote Suboptimal MPC relies on an initial feasible solution, a warm
an arbitrary p–norm; it will be made clear when a specific start sequence U (k) ∈ U(x ) at each step k, which is
warm kk
norm is considered. For a vector x ∈ Rn, define |x| := improvediteratively.Thesuboptim|alMPCproblemconsidered
[|x |...|x |]T.Alsoforavectorx∈Rn,bymax|x|wedenote inthispaperisformulatedasfollows:
1 n
Problem2.1. For each k ∈ Z , given U (k) find a states are typically discretized to apply DP algorithms. The
1 warm
sequenceU(k)∈U(x )suchth≥at main idea is to discretize the state space with a finite grid and
kk
| to express each state outside of the grid as an interpolation of
J(x ,U (k))>J(x ,U(k)), (6)
kk warm kk nearbygridelements.Thesameinterpolationlawisappliedto
| |
andtheconstraints(3),(4)and(5)aresatisfied. computethecostofthecurrentnongridstateasafunctionofthe
costsofthenearbygridstates.Assuch,bydiscretizingboththe
Consider now a locally stabilizing control law k : X → U.
f T state space for each time in the control horizon and the input
AssumethatX isasublevelsetofV .ForstabilityoftheMPC
T f space, a transition diagram is obtained which approximates
closed–loopsystemitisalsorequiredthat:
the dynamics of the system in the continuous space. On this
• V (f(x,k (x)))+L(x,k (x))≤V (x)forallx∈X ; discretetransitionsystem,DPisappliedtodeterminethepath
f f f f T
• there exist α1,α2 ∈ K such that α1(|x|) ≤ Vf(x) ≤ with the smallest cost, which, for a given initial state xkk,
α (|x|)forallx∈X ;∞ providesthecontrolsequenceU(k). |
2 T
• thereexistα ∈ K suchthatL(x,u) ≥ α (|x|)forall
3 3 SolvingMPCwithDPviadiscretizationsuffersfromthe“curse
(x,u)∈X×U; ∞
of dimensionality”, due to sampling of both state and input
Remark2.2. ThefirstpropertyabovelistedimpliesthatXT is spacesandtherequirementforconstructingthecompletetran-
positivelyinvariantforthesystemxk+1 = f(xk,kf(xk)).The sitiondiagram,byevaluatingthesubsequentstateandcostfor
secondandthirdpropertiescanbesatisfiedif,forexampleVf eachsampledstateandallthesamplesintheinputspace.
andLarepositivedefinitequadraticfunctions.
An alternative to approximate DP, namely sampling based
These properties imply that the cost function J(·,·) is a Lya- MPC, was developed within the area of Robotics, where typ-
punovfunction,see(MayneandRawlings,2009,Lemma2.14). icallyoptimizationproblemsarisingincontrolarenon–convex,
Assuch,ifFisthesetofstatesinXforwhichthereexistsacon- due to either kinematic constraints or constraints posed by
trol sequence U(k) which satisfies the constraints (3), (4) and obstacle avoidance. Sampling–based motion planning such as
(5),thenthesolutiontoProblem2.1providesanasymptotically rapidly-exploringrandomtrees(RRTs)(LaValle,1998)orran-
stabilizingcontrollerwitharegionofattractionF. domized A* algorithms (Likhachev and Stentz, 2008), have
beenextensivelyusedtoconstructtreeswhichconnectaninitial
In(MayneandRawlings,2009),analgorithmwasproposedfor
state to a final state based on sampling states in the search
suboptimal MPC with stability guarantees. Given the current
spaceandsearchingforfeasibleinputstoconnectthesestates.
state x and the previous control sequence U(k − 1) as an
kk To approach issues related to the, possibly unfeasible, search
input,th|estepsofthealgorithmthereincanbesummarizedas
for an input after sampling only in the state space, a method
follows:
(SBMPC)wasproposedin(Dunlapetal.,2010),whichsamples
• Ifx ∈/ X ,usethewarmstartsequence: the input space at each sampling period and creates trees that
kk T
| containfeasiblestatetrajectories.Theoptimalpathtoagoalin
U (k)={u ,u ,...,
warm kk 1 k+1k 1 thestate–spaceisthensearchedforwithinthetreeusinggoal–
| − | −
uk+N 2k 1,kf(xk+N 1k 1)}. (7) directedsearchalgorithms,suchasLPA*.Suchalgorithmsrely
− | − − | −
SolveiterativelyProblem2.1viaoptimizationtoimprove on computing a heuristic measure of the distance from the
U (k) with a U(k) ∈ U(x ). Apply control u = current sample to the goal. Selecting the heuristic is however,
warm kk k
u . | notanobvioustaskforgeneralnonlinearsystems.
kk
• Ifx| ∈X ,setu =k (x ),or,similarlytothepre-
kk T kk f kk Therefore,adesirablefeatureofasampling–basedsuboptimal
vious| case, use the w| arm start|and solve an optimization
NMPCalgorithmisanon–exponentialgrowthinthetreegen-
algorithmiterativelytofindanimprovedcontrolsequence
erated through sampling of the state or input space at each
U(k)∈U(x ).
kk step in the horizon. Furthermore, it is also desirable to reduce
|
Optimization solvers, in both optimal and suboptimal MPC, thedependencyofthealgorithmonthenon–obviousselection
present difficulties in terms of parallelization and a priori of a heuristic, which significantly impacts the performance of
known execution time independently of the current state x . the sampling–based strategy. To circumvent these issues, an
k
To circumvent these drawbacks and enable a computationally alternativesuboptimalstrategyforsamplingtheinputspaceis
efficient and parallelizable nonlinear MPC algorithm, we pro- proposedinthenextsection,basedonsequentiallyupdatinga
pose a sampling based approach to solving Problem 2.1, as warmstartfeasiblesequenceofpredictedinputs.
illustrated in Section 3. To this end, we first review existing,
similarapproacheswithinnonlinearMPC. 3. MAINRESULT
2.3 Existingapproachesbasedonsampling In this section we present the proposed sampling–based algo-
rithmforsolvingsuboptimalNMPCproblemsandweprovide
This section provides a brief in depth review of two main adetailedcomplexityanalysis.
existing approaches for NMPC based on sampling, namely
approximate DP and sampling–based MPC, which were also
3.1 Prototypealgorithm
mentionedintheIntroduction.
An approximate version of DP, as a tool for solving opti- Inspired by the suboptimal MPC and DP principles, we pro-
mizationproblems,wasproposedin(Bertsekas,2005a,Section poseasampling–basedsolutiontoProblem2.1,which,bythe
6.6.1). DP has been successfully applied for determining ex- mechanisminvolvedintheiterativeimprovementoftheinitial
plicit solutions for linear MPC controllers, see, e.g., (Mun˜oz feasible control sequence U (k), has a low increase with
warm
delaPen˜aetal.,2004).InnonlinearMPC,thestateandinput thecontrolhorizonofthecomputationalcomplexity.
j =0 j =1 j =2 • U(k)jq isafeasiblesequence,i.e.,theconstraints(3),(4)
uw00 uw2|0 u12|0 • athnedn(5e)whoclods,t, J = J(x ,U(k)jq) decreases with
| new kk
uw u22|0 respecttoJ(xk|k,Uwarm(k)),|
10
| u3 then Uwarm(k) is replaced by U(k)jq and the algorithm con-
2|0 tinues backwards with respect to the prediction time j, in a
similarmanner.Withthisapproach,atanypointintime,ifthe
maximallyallowedcomputationaltimeisexceeded,afeasible,
j =0 j =1 j =2 improved control sequence exists and it can be utilized as a
u310 suboptimalMPCsolution.
u4 |
10
uw | By choosing to cover the control horizon backwards, we can
00
| u1 reuse at each step j, the states φ(x ,U (k),i) for all
1|0 uw1|0 uw2|0 =u32|0 i ∈ Z[1,j], which, by the feasibilityk|kof Uwwaramrm(k), already
u2 satisfy the state and input constraints. This holds not only for
10
| the original U (k), but for any subsequent improvement
warm
of U (k). As such, also the stage costs up to the jth state
warm
j =0 j =1 j =2 can be recovered from previous computations. The reusability
u100 uw =u4 of previously computed costs and states by navigating the
| 10 10
| | control horizon in a backward manner resembles the working
uw0|0 u200 principleofDP.Thissuggestsintuitivelythattheproposedcost
|
uw =u3 improvement method could deliver good performance, which
u300 u4 2|0 2|0 issupportedbyresultsobtainedinnon–trivialcasestudies,see
| 00
| Section 4. A formal analysis of convergence towards the DP
solution, as the discrete–time k increases, makes the object of
Fig.1.Samplingprinciple.
futureresearch.
Algorithm1Sampling–basedsuboptimalNMPCalgorithm. When k = 0, we can select an initial sequence Uwarm(0)
as the solution of the optimal MPC problem. In this case, we
Input: NU,w{anrjm}(jk∈)Z[=0,N{−u1]wk,kx,k.|k..,,Xu,wkU+,NXT1,kJ}(·,·) csiabnlepsroeqceueedncweiUthwkarm=(01),. aAnlt“eornraactilvee”l,yb,ywreancdaonmsleylescetleactfienag-
Output: U(k),u | − | sequences of inputs in U until a feasible U (0) is found.
k warm
In this case, if it is feasible to afford the computational time,
1: Jsub ←J(xkk,Uwarm(k)); wecanproceedwithAlgorithm1inanattemptofobtainingan
2: forallj =N|−1:−1:0do improvedsequence.
3: Selectnj samplesuqj+kk ∈U,q ∈Z[1,...,nj]; For k ∈ Z , to choose the input U (k) for Algorithm 1,
4: forallq =1:1:nj do| one can us≥e1the receding horizon prwinacrimple of MPC. As such,
5: ifj ≥1then theinputsequenceU (k)={u ,...,u ,u}
6: U(k)jq ={uwkk,...,uwk+j 1k,uqk+jk, is a warm start at twimaremk. If φ(xk|k−1 ,U(kk+−N−12)|,kN−1) ∈
uw| ,...,u−w| }|; X and X is positively invariankt−f1o|rk−th1e system x =
k+j+1k k+N 1k T T k+1
7: else | − | f(xk,kf(xk)), one can select u = kf(xk+N 1k 1). In this
8: U(k)jq ={uq ,uw ,...,uw }; case,U (k)isafeasiblesolution,andthere−for|e−acandidate
kk k+1k k+N 1k warm
9: φ(xkk,iUf (φk()xjkq|,kN,U)(∈k)XjqT,|it)he∈n X|,∀i ∈ Z[j+−1,N|−1] and arwenamdrmanionsstkafrfet(af·so)i,rblteehv,eeinr.eyo.,knφe∈(cxZan≥1se.lIefctth,Ueure∈exUis(tsksu)nc,ohNtte)hra∈mtiXUna.wlIansremtthX(eksTe)
10: | Jnew ←J(xkk,U(k)jq); circumstances, however,ks−ta1|bki−li1ty owfarthme closed loop is not
11: ifJnew <Jsub| then guaranteed.SuchanexampleisillustratedinSection4.3.
12: Jsub ←Jnew; Remark3.1. Common sampling schemes are employed for
13: Uwarm(k)←U(k)jq; sampling of the input space U at each iteration, among which
14: U(k)=Uwarm(k),uk =uwkk; weconsiderdeterministicuniformsampling,whichplaceseach
|
sampleatequaldistancefromeachother,tocoveruniformlythe
spaceU.Alternatively,“true”randomsamplescanbeselected,
Theprinciplebehindtheproposedsampling–basedapproachis
whicharesimplertodrawinhigherdimensionalspaces.Quasi–
illustratedinFig.1andformalizedinAlgorithm1.InFig.1,the
random low–discrepancy sequences, see (Chakrabarty et al.,
iterative improvement of an initial cost provided by an initial
2016), may be used as well, considering the fact that they
feasible control sequence U (k) is illustrated for the case
warm appeartoberandomformultiplepurposes,suchasMonteCarlo
when N = 3. The algorithm keeps always U (k) as a
warm simulations.Suchsamplingmethods,e.g.,SobolorHaltonse-
reference sequence, and it covers the horizon in a backward
quences, have been shown, see, e.g., (de Dios Ortu´zar et al.,
fashion, in N iterations. Starting with j = N − 1, at each
iintepruatticoonnsstterpa,inntjsesatmUp.lFeosr{euaqkc+hjs|kam}qp∈lZe[,1,tnhje]raerfeedrernawcensferqoumenthcee 1R9e9m4a)r,kto3.b2e.ttAerscsouvmeertthhaetskpfaciseathlaoncarlalnydsotambisleizqiunegnccoens.trollaw
U (k)ismodifiedinthejth location,andanewsequence, onXT,asublevelsetofVf,whichispositivelyinvariantforthe
warm
U(k)jq isobtained.Ifthefollowingpropertieshold: systemxk+1 = f(xk,kf(xk))andVf andLare,e.g.,positive
definite quadratic functions. Considering that the sequence N(N +1)
C =c +c N. (10)
U(k)providedbyAlgorithm1isasuboptimalsolutionsolving 1 2
2
Problem2.1,then,bytheargumentsinSection2.2,Problem2.1
The complexity bound given in (10), though quadratic in the
is recursively feasible and it ensures stability of system (1) in
predictionhorizon,yieldsareasonablecomplexity,considering
closedloopwithuk =uwkk. that, in NMPC, a horizon N = 10 is considered a reasonably
|
Remark3.3. The working mechanism of Algorithm 1 resem- largehorizon.
bles the working principle of the rollout algorithm described Ingeneral,ifwehavep∈Z processors,thenthecomplex-
in the survey paper (Bertsekas, 2005b). Therein, a rollout al- [2,n)
ityofAlgorithm1is
gorithm improves iteratively a base policy (here, the feasible
(cid:18) (cid:19)
N(N +1)
control sequence Uwarm(k)) to provide a suboptimal solution C =(cid:100)n/p(cid:101) c1 +c2N , (11)
to an optimal control problem via approximative DP and sub- 2
optimalcontrol.Theworkingprincipleoftherolloutalgorithm where the term (cid:100)n/p(cid:101) appears due to the fact that a thread
withsamplingbasedMPC,i.e.,choosingattimek theiterated can not engage in computations related to a subsequent time
solution of the previous time instance, k−1, as a warm start, horizon until all the threads have finalized the computations
andasamplingstrategy,isnotprovidedtherein. relatedtothecurrenttimehorizonj.
Remark3.4. Due to the sampling procedure having a discon-
tinuous behaviour, the inputs might be varying more than it is 4. ILLUSTRATIVEEXAMPLES
safe for some applications. This problem might be alleviated
by either smoothening the inputs via interpolation of the ob- The sampling–based suboptimal NMPC strategy proposed in
tainedinputsequencewiththeinitialsequence,orbypenalizing Section3isillustratedonthreenonlinearsystems,tohighlight
∆u =u −u viaconstraintsorviathecostfunction,such variousfeaturesofthismethod.Alltestshavebeenperformed
k k k 1
thattheinputva−riabilityislimitedtoacceptablebounds. on a system with the following specifications: Intel Core i7-
3770CPU3.4GHz,16GBRAM,64-bitOS.
3.2 Complexityanalysis
4.1 Cart–springsystem
InordertoanalyzethecomplexityofAlgorithm1,thefollow-
The method developed in this paper will first be applied to
ingassumptionsareundertaken,foragivenstatex,inputuand
a system incorporating an exponential nonlinearity, i.e., the
inputsequencesU,U ,U :
1 2 model of a cart with mass M, which is moving on a plane,
1) Thecostofevaluatingf(x,u)andperformingafeasibility see(Raimondoetal.,2009).Thiscartisattachedtoawallvia
testf(x,u)∈Xisc ; aspringwithelasticconstantkvaryingwiththefirststatek =
1
2) ThecostofevaluatingJ(x,U)isc2; k0e−x1, where x1 stands for the displacement of the carriage
3) ThecostofcomparingJ(x,U ) < J(x,U )andchanging from the equilibrium position. A damper acts as a resistor in
1 2
J and U (k) if necessary, i.e., steps 11-13 in Algo- thesystem,withdampingh .Thediscretizednonlinearmodel
sub warm d
rithm1,isnegligible. ofthecartandspringsystemisthefollowing:
4) ThecurrentcostJ isinstantaneouslyavailableforcom- (cid:20) (cid:21)
sub x (k+1)
parison with each of the nj samples according to step 11 x(k+1)= x1(k+1) =f1(x(k))+F2u(k), (12)
2
in Algorithm 1 and each of the new sequences U(k)jq may
where
modifyJ ifthenewcostJ issmallerthanJ .
sub new sub (cid:20) (cid:21)
x (k)+T x (k)
Tinhge:complexityofAlgorithm1foragivenxk|k isthefollow- f1(x(k))= x2(k)−TsMρ10e−x1x1s(k2)−TshMdx2(k) ,
N 1 N 1 F2 =(cid:2)0 TMs (cid:3)T , (13)
(cid:88)− (cid:88)− where x is the velocity of the cart and u is an external force
C =c1 (N −j)nj+c2 nj. (8) 2
whichactsasaninputtothesystem.Theparametervaluesare
j=0 j=0
T =0.4s,ρ =0.33,M =1,h =1.1.
Ifweassumen ≤nforallj ∈Z ,thenthecomplexity s 0 d
j [0,N 1]
(8)canbeupperboundedby − The MPC controller has to steer the cart to the origin from
a non–zero initial state, while satisfying the input and state
N(N +1)
C =nc +c Nn. (9) constraints,whichare
1 2
2
|u|≤4.5,|x |≤2.65, (14)
A possible reduction of the bound (9) might be attained, con- 1
and reducing the cost (2), where the stage cost and terminal
sideringthefactthatallthestatessubsequenttoanon–feasible
cost are quadratic functions, i.e., L(x,u) = xTQx + uTRu
state are no longer evaluated and checked for feasibility. This
meansthat,instep9ofAlgorithm1,ifφ(x ,U(k)jq,i)∈/ X andVf(x) = xTPx.Choosethefollowingparametersforthe
kk
for a specific i ∈ Z , then φ(x ,|U(k)jq,r) for all MPCproblem:
[j+1,N 1] kk (cid:20) (cid:21)
r ∈ Z , are no longe−r evaluated, in|which case steps 2) 7.0814 3.3708
[i+1,N] M =4,P = ,Q=diag(1,1),R=1.
and3)areskippedalltogether. 3.3708 4.2998
In(Raimondoetal.,2009)itisshownthatthecontrollaw
Consider now that many threads are available, from multiple
processors.Noticealsothatateachtimeinthehorizon,alln u=k (x)=−[0.8783 1.1204]f (x),
j f 1
computationscanbeperformedseparately.Intheseconditions, islocallystabilizingintheset
assuming we have n threads available, then the complexity of X ={x|V (x)≤4.7},
Algorithm1isupperboundedby T f
State trajectory evaluation time[s]
0.16
6
terminal set
fmincon 0.14
5 n=0
j 0.12
n=5
j
4 nnj==1300 me [s] 0.1
j ti0.08
3
0.06
x2 2
0.04
1 0.021 2 3 4 5 6 7 8 9 10 11 12
k
0
Fig.4.Computationtime.
−1 (cid:8)(cid:16)(cid:17)(cid:18)(cid:1)(cid:12)(cid:19)(cid:15)(cid:11)(cid:18)(cid:13)(cid:16)(cid:15)
(cid:3)(cid:6)(cid:2)
−2
−2 −1 0 1 2 (cid:3)(cid:5)(cid:2)
x
1
(cid:3)(cid:4)(cid:2)
Fig.2.Statetrajectory.
inputs (cid:3)(cid:2)(cid:2)
4
2 (cid:19)(cid:10) (cid:7)(cid:2)
(cid:17)
(cid:9)
0
u (cid:6)(cid:2)
−2
−4
(cid:5)(cid:2)
−6
2 4 6 8 10 12 14 16 18 20
k (cid:4)(cid:2)
Fig.3.Inputsappliedtothesystem.
(cid:2)
(cid:4) (cid:5) (cid:6) (cid:7) (cid:3)(cid:2) (cid:3)(cid:4)
which is a terminal set where the conditions for stability of (cid:14)
MPCmentionedinSection2.2aresatisfied.
Fig.5.ThecostJ .
sub
Algorithm1isappliedfortheMPCcontrolofsystem(12).We
comparetheresultsofthismethodwiththeresultsprovidedby
Computational performance. Computational details for N<=20.
fmincon in Matlab, even though, as it will be seen later, the 25 0.3
optimization tool does not always provide the optimum, due fmincon
20 sampling
to local minima. The scalability of the algorithm is tested by bound (9) 0.2
varyingboththenumberofsamplesnandthecontrolhorizon s]15 bound (10) s]
e [ e [
N.Thetestsillustratedinthispaperareperformedonafeasible m m
initialstatex =[−2.5,3].Thechoiceoftheinitialcondition ti10 ti0.1
00
does not influ|ence greatly the results, which are similar for 5
otherfeasibleinitialstates. 0 0
20 40 60 80 100 5 10 15 20
For the first experiment, fix N = 10. An initial U (0) N N
warm
is provided by a random “oracle”. Consider n = n, for all
j
j ∈ Z , taking various values in the set {0,5,10,30}. Fig.6.Computationalperformanceversusproposedbounds.
[0,N 1]
For n = 0,−U (0) is propagated though iterations without
warm immediate smoothening of the trajectories and input sequence
anyinterventionorchangefromthesamplingmechanism.This
even for a small n. In Fig. 4, the computational time, without
servesasareference,tonoticetheimprovementsbroughtinby
parallelization,isillustrated.Atk = 1,thecomputationalcost
Algorithm 1. The results are illustrated in Fig. 2–5, where the
offindinganoracleisincluded.Fig.5illustratesthevaluesof
legendfromFig.2holdsuntilFig.5.Iterationsareconsidered
J for each iteration. Notice, overall, that even for a small
from k = 1 until k = 20. Though the different sampling sub
number of samples, the performance of the closed loop sys-
optionsproposedinRemark3.1provideinthisexamplesimilar
tem is significantly improved and the computational time is
outcomes, Halton points have been used here for illustration.
promising,evenforanon–parallelimplementation.Also,asthe
This choice is motivated by the practical feature that, adding
iterationkadvances,theinitiallymodestperformanceincreases
extrapointsonlywhentheyarerequireddoesnothaveimpact
onthecoverageofthesetU. significantly,duetothecontinuousimprovementofUwarmand
the receding horizon principle. Interestingly, even for n = 5,
In Fig. 2 and Fig. 3 the state trajectory and the inputs u(k) at iteration k = 4, the cost J (4) is smaller than the cost
sub
applied to the system are illustrated. The constraint specifica- computedviafmincon,which,duetolocalminima,provided
tions (14) are satisfied for all cases, at all times. Notice the afeasiblebutnotoptimalsolution.
For the second experiment we aim to test the computational Cost function
complexity of Algorithm 1 in terms of the control horizon. 250
fmincon
Theimplementationusedheredoesnotuseparallelization.Fix 200 n=0
n = 10andN takesvaluesintheset{3,10,20,50,100}.The j
n=5
rceosmulptslexairteyiflolursAtrlagtoerdithinmF1igis.d6e.pWictiethd,raendd,itthiesaclowmaypsutsamtiaolnlearl Jsub110500 nnjj==2500
j
thanthebound(9),whichisfollowedclosely.Itisexpectedthat
ondedicateddevices,thecomplexityofAlgorithm1issmaller, 50
due to the processors not running in parallel threads related
0
0 5 10 15 20 25 30 35 40 45 50
to other system applications. With parallelization, further re-
k
duction in complexity is expected, as described in Section 3.2
evaluation time[s]
and illustrated in Fig. 6. Notice that, for smaller horizons, 1.4
the complexity of Algorithm 1 is smaller than the complexity
1.2
of fmincon, even without parallelization. For N = 100,
1
fminconprovidessolutionswhicharenotfeasible,whilethe
0.8
proposedAlgorithm1stillterminatesin21.6seconds. t[s]0.6
0.4
4.2 Buck–Boostpowerconverter
0.2
0
Next, the bilinear model of a Buck–Boost power converter is 0 5 10 15 20 25 30 35 40 45 50
k
considered,asin(Spinuetal.,2011):
(cid:20)x(k)TC (cid:21) Fig.7.PerformanceofAlgorithm1onthepowerconverter.
x(k+1)=Ax(k)+Bu(k)+ 1 u(k), (15)
x(k)TC2 Alltheconstraintsweresatisfiedforallthepresentedsituations,
wherex := [v i ]T ∈ X ⊂ R2 isthestatevectorconsisting and it is expected that, through implementation on dedicated
C L
multi–thread systems, the computational time for the control
of the voltage across the output capacitor and the current
throughthefilterinductor.Theinputu:=[d d ]T ∈U⊂R2 lawdecreasestotheextentoffittingthetightsamplingperiod
1 2 ofthepowerconverter.
stands for the duty–cycle ratio of the control signal applied to
theswitchingnode.Theparametersare
4.3 Wheeledmobilerobot
(cid:18) (cid:20)− 1 0 (cid:21)(cid:19) (cid:20) 0 0(cid:21)
A= I2+Ts R0HC −RL ,B = vs 0 Ts, Thelastexampleillustratesthemethodologydevelopedinthis
L L
(cid:20) (cid:21) (cid:20) (cid:21) paperforanobstacleavoidancetaskbyanonholonomicsystem
0 0 0 −1
C = T ,C = L T , with trigonometric nonlinearities, due to kinematics, i.e., a
1 0 1 s 2 0 0 s
C model of the wheeled mobile robot (WMR), as described in
with the values RL = 0.2Ω, C = 22µF, L = 220µH, (Kuhneetal.,2005):
Ts =10µs. (cid:34)x (k)+u (k)cosx (k)T (cid:35)
1 1 3 s
The aim of the control loop is to stabilize the system to the x(k+1)= x (k)+u (k)sinx (k)T . (16)
2 1 3 s
equilibriumpointx := [20 0.5]T,u := [0.81 0.4]T,under x3(k)+u2(k)Ts
e e
theconstraintsi ∈ R ,v ∈ R ,u ∈ R2 .The In (16), the state x ∈ R3 describes the position and the ori-
L [0,3] C [ 0.1,22.5] [0,1]
terminalcontroller − entation of the robot with respect to a global inertial frame
{O,X,Y}, and the input u ∈ R2 gives the linear and an-
(cid:20)−0.0014 −0.3246(cid:21) gular velocity, respectively. The parameter Ts = 0.1s is the
u=u +K(x−x ),K = , discretizationperiodofsystem(16).
e e 0.0001 −0.0055
stabilizing the system in the terminal set X given in (Spinu The MPC strategy aims at driving the WMR from an initial
etal.,2011,setPinSectionIV.C),andtheqTuadraticcostwith state x = [0 6 0]T to the origin of the inertial frame, i.e.,
00
thematrices x = [|0 0 0]T, while satisfying the input constraints u ∈
(cid:20) (cid:21) g 1
46.6617 42.8039 R , u ∈ R . A quadratic cost function of
Q=diag(1,2),R=diag(1,1),P = , [ 0.47,0.47] 2 [ 3.77,3.77]
42.8039 69.4392 the−form −
satisfies all the conditions for stability formulated in Sec- N(cid:88)−1
tion2.2. J(xkk,U(k))=xTk+N kPxk+N k+ xTk+jkQ(j)xk+jk+
| | | | |
j=1
Similarly to the previous example, we apply Algorithm 1 for
N 1
the MPC control of system (15). Fix the initial state x00 = + (cid:88)− uT Ru
[1 2]T + x , which is outside of the terminal set X ,| and k+j|k k+j|k
e T j=0
N = 10. U (0) is given by an oracle and we use random
warm is considered, with the parameters: Q(j) = 2j 1Q, P =
sampling of U. See in Fig. 7 the effect of varying n on the −
50Q(N),Q=diag(1,1,0.5),R=diag(0.1,0.1),N =5.
cost function J and the time necessary, per iteration, to
sub
compute the corresponding control law. Notice the effect of In this example we illustrate Algorithm 1 with the above pa-
the unknown termination time on the evaluation time for the rameters. Consider U (0) generated by an “oracle”, and
warm
optimization performed through fmincon and the relatively sampling of U based on random sequences. The result for
equalcomputationaltimeofAlgorithm1overtheiterationsk. n = 30isillustratedinFig8.fminconcouldnotbeapplied
State trajectory around obstacle.
6 improvesintermsofcostaninitiallyfeasiblecontrolsequence.
ThissuboptimalNMPCstrategyprovidesapromisingcompu-
tationalcomplexityevenforlargecontrolhorizons,withgood
Algorithm 1
5 perspectives for parallel implementation. Future work aims at
SBMPC
obstacle investigatingtheconvergenceoftheproposedalgorithmtothe
optimal solution, the scalability in terms of system dimension
4
andtheeffectofparallelizationoncomputationaltimeforreal–
lifeapplications.
x23
REFERENCES
2
Bertsekas, D.P. (2005a). Dynamic programming and optimal
control,volume1. AthenaScientificBelmont,MA.
1 Bertsekas, D.P. (2005b). Dynamic programming and subopti-
malcontrol:AsurveyfromADPtoMPC. EuropeanJournal
ofControl,11(4),310–334.
0 Chakrabarty, A., Dinh, V., Corless, M., Rundell, A., Zak,
-1.6 -1.4 -1.2 -1 -0.8 -0.6 -0.4 -0.2 0 0.2 0.4
x S., and Buzzard, G. (2016). Support vector ma-
1
chine informed explicit nonlinear model predictive con-
Fig.8.TrajectoriesofWMR. trol using low-discrepancy sequences. IEEE Transactions
total verification time[s] on Automatic Control, preprint published on–line, DOI:
0.8 10.1109/TAC.2016.2539222,PP(99).
deDiosOrtu´zar,J.,Willumsen,L.G.,etal.(1994). Modelling
0.7
transport. WileyNewJersey.
0.6 Dunlap,D.D.,Caldwell,C.V.,andCollins,E.G.(2010). Non-
linear model predictive control using sampling and goal-
0.5 directed optimization. In 2010 IEEE International Confer-
me [s]0.4 Greu¨nncee, Lon. aCnodntPraonlnAepkp,liJc.a(t2io0n1s1,).13N49o–n1li3n5e6a.rImEEodEe.l predictive
ti control.Theoryandalgorithms. London:Springer–Verlag.
0.3
Johansen,T.A.(2004). Approximateexplicitrecedinghorizon
0.2 controlofconstrainednonlinearsystems. Automatica,40(2),
293–300.
0.1 Kuhne, F., Lages, W.F., and Da Silva, J.G. (2005). Point sta-
bilization of mobile robots with nonlinear model predictive
0
control.InIEEEInternationalConferenceMechatronicsand
50 100 150 200 250 300 350 400
Automation,2005,volume3,1163–1168.IEEE.
k
LaValle,S.M.(1998). Rapidly-exploringrandomtrees:Anew
Fig.9.TimeperformanceofWMR. tool for path planning. TR 98-11, Computer Science Dept.,
IowaStateUniversity.
duetofeasibilityissuesrelatedtonon–convexitycausedbythe
Ławryn´czuk, M. (2009). Computationally efficient nonlinear
presence of the obstacle. Notice the avoidance of the obstacle
predictivecontrolbasedonstate-spaceneuralmodels. InIn-
of the WMR under the control law generated by Algorithm 1.
ternational Conference on Parallel Processing and Applied
Due to the sampling mechanism, however, the inputs are not
Mathematics,350–359.Springer.
smooth, which could be alleviated, e.g., by an interpolation
Lazar, M. and Heemels, W. (2009). Predictive control of hy-
mechanism. For comparison we illustrate also the result using
brid systems: Input-to-state stability results for sub-optimal
SBMPC (Dunlap et al., 2010), with the same horizon N and
solutions. Automatica,45(1),180–185.
n = 30. Notice, in the state trajectory plot in Fig 8 the fact
Lee, J.M. and Lee, J.H. (2004). Approximate dynamic pro-
that, after 400 iterations, the WMR did not reach x yet, and
g grammingstrategiesandtheirapplicabilityforprocesscon-
the input values are not yet 0, which means that the sampling
trol:Areviewandfuturedirections. InternationalJournalof
mechanismdidnotconsiderthattheWMRisclosetothegoal.
ControlAutomationandSystems,2,263–278.
Inthiscase,itisrecommendedtosamplemoredenselyaround
Likhachev, M. and Stentz, A. (2008). R* search. Lab Papers
0inthesetU,ratherthanuniformlycoveringUwithsamples.
(GRASP),23.
NoticeinFig9that,duetothebuildingofatreeandthesearch Magni, L., Raimondo, D.M., and Allgo¨wer, F. (2009). Non-
in a tree for the best path, SBMPC is more computationally linear model predictive control: Towards new challenging
demanding. The complexity of Algorithm 1 with the given n applications,volume384.Springer,LecturenotesinControl
fitsthesamplingperiodT = 0.1softheWMR,whichmakes andInformationSciences.
s
themethodapplicableforreal–timecontrolofthissystem. Mayne,D.andRawlings,J.(2009). Modelpredictivecontrol:
theoryanddesign. Madison,WI:NobHillPublishing,LCC.
5. CONCLUSION Mun˜ozdelaPen˜a,D.,Alamo,T.,Bemporad,A.,andCamacho,
E.F. (2004). A dynamic programming approach for deter-
Inthispaper,analgorithmbasedonsamplingoftheinputspace mining the explicit solution of linear MPC controllers. In
at each time in the horizon was proposed, which iteratively IEEEConferenceonDecisionandControl.
Pannocchia,G.,Rawlings,J.B.,andWright,S.J.(2011). Con-
ditionsunderwhichsuboptimalnonlinearMPCisinherently
robust. Systems&ControlLetters,60(9),747–755.
Piovesan, J.L. and Tanner, H.G. (2009). Randomized model
predictive control for robot navigation. In IEEE Interna-
tionalConferenceonRoboticsandAutomation,2009,94–99.
Kobe,Japan.
Raimondo,D.M.,Limon,D.,Lazar,M.,Magni,L.,andCama-
cho,E.F.(2009). Min-maxmodelpredictivecontrolofnon-
linear systems: A unifying overview on stability. European
JournalofControl,15(1),5–21.
Scokaert,P.O.,Mayne,D.Q.,andRawlings,J.B.(1999). Sub-
optimal model predictive control (feasibility implies stabil-
ity). IEEE Transactions on Automatic Control, 44(3), 648–
654.
Spinu,V.,Lazar,M.,andvandenBosch,P.(2011). Anexplicit
state-feedback solution to constrained stabilization of dc-dc
power converters. In 2011 IEEE International Conference
onControlApplications(CCA),1112–1118.IEEE.