T R R ECHNICAL ESEARCH EPORT Local Pursuit as a Bio-Inspired Computational Optimal Control Tool by Cheng Shao, Dimitrios Hristu-Varsakelis CDCSS TR 2005-1 (ISR TR 2005-85) + CENTER FOR DYNAMICS C D - AND CONTROL OF S SMART STRUCTURES The Center for Dynamics and Control of Smart Structures (CDCSS) is a joint Harvard University, Boston University, University of Maryland center, supported by the Army Research Office under the ODDR&E MURI97 Program Grant No. DAAG55-97-1-0114 (through Harvard University). This document is a technical report in the CDCSS series originating at the University of Maryland. Web site http://www.isr.umd.edu/CDCSS/cdcss.html Report Documentation Page Form Approved OMB No. 0704-0188 Public reporting burden for the collection of information is estimated to average 1 hour per response, including the time for reviewing instructions, searching existing data sources, gathering and maintaining the data needed, and completing and reviewing the collection of information. Send comments regarding this burden estimate or any other aspect of this collection of information, including suggestions for reducing this burden, to Washington Headquarters Services, Directorate for Information Operations and Reports, 1215 Jefferson Davis Highway, Suite 1204, Arlington VA 22202-4302. Respondents should be aware that notwithstanding any other provision of law, no person shall be subject to a penalty for failing to comply with a collection of information if it does not display a currently valid OMB control number. 1. REPORT DATE 3. DATES COVERED 2005 2. REPORT TYPE - 4. TITLE AND SUBTITLE 5a. CONTRACT NUMBER Local Pursuit as a Bio-Inspired Computational Optimal Control Tool 5b. GRANT NUMBER 5c. PROGRAM ELEMENT NUMBER 6. AUTHOR(S) 5d. PROJECT NUMBER 5e. TASK NUMBER 5f. WORK UNIT NUMBER 7. PERFORMING ORGANIZATION NAME(S) AND ADDRESS(ES) 8. PERFORMING ORGANIZATION Army Research Office,PO Box 12211,Research Triangle Park,NC,27709 REPORT NUMBER 9. SPONSORING/MONITORING AGENCY NAME(S) AND ADDRESS(ES) 10. SPONSOR/MONITOR’S ACRONYM(S) 11. SPONSOR/MONITOR’S REPORT NUMBER(S) 12. DISTRIBUTION/AVAILABILITY STATEMENT Approved for public release; distribution unlimited 13. SUPPLEMENTARY NOTES The original document contains color images. 14. ABSTRACT see report 15. SUBJECT TERMS 16. SECURITY CLASSIFICATION OF: 17. LIMITATION OF 18. NUMBER 19a. NAME OF ABSTRACT OF PAGES RESPONSIBLE PERSON a. REPORT b. ABSTRACT c. THIS PAGE 11 unclassified unclassified unclassified Standard Form 298 (Rev. 8-98) Prescribed by ANSI Std Z39-18 LocalPursuitasaBio-InspiredComputational (cid:1) OptimalControlTool ∗ ChengShaoa,DimitriosHristu-Varsakelisb, aDepartmentofMechanicalEngineeringandInstituteforSystemsResearch UniversityofMaryland,College Park,MD20742USA bDepartmentofAppliedInformatics,University ofMacedonia,Thessaloniki54006,Greece Abstract This paper explores the use of a bio-inspired control algorithm, termed “local pursuit”, as a numerical tool for computing optimalcontrol-trajectorypairsinsettingswhereanalyticalsolutionsaredifficulttoobtain.Inspiredbytheforagingactivities ofantcolonies,localpursuithasbeenthefocusofrecentworkoncooperativeoptimization.Itallowsagroupofagentstosolve a broad class of optimal control problems (including fixed final time, partially-constrained final state problems) and affords certainbenefitswithrespecttotheamountofinformation(descriptionoftheenvironment,coordinatesystems,etc.)required to solve the problem. Here, we present a numerical optimization method that combines local pursuit with the well-known techniqueofmultipleshooting,andcomparethecomputationalefficiencyandcapabilitiesofthetwoapproaches.Theproposed method method can overcome some important limitations of multiple shooting by solving an optimal control problem “in small pieces”. Specifically, the use of local pursuit increases the size of the problem that can be handled under a fixed set of computationalresources.Furthermore,localpursuitcanbeeffectiveinsomesituationswheremultipleshootingleadstoanill- conditionednonlinearprogrammingproblem.Thetrade-offisanincreaseincomputationtime.Wecompareourpursuit-based method withdirectmultiple shootingusinganexample thatinvolvesoptimalorbittransferofasimplesatellite. Key words: Co-operativecontrol,Optimization, Agents,Groupwork,Trajectories,Numericalmethods,Nonlinear programming 1 Introduction honey bees, which share information by “dancing” and distributethemselvesamongdifferentflowersaccording to the “profitability” of each location; schools of fish Overthepastdecade,researchershavebeenincreasingly thatswiminagile,tightformations;andants,whichuse lookingtocooperativestrategiesasameansforaddress- pheromone secretions for recruiting nest-mates and for ingsystemsproblemswhicharedifficulttosolvebysin- discovering efficient paths between their nest and food glesystems[18,5].Thepushforcooperationispartlydue sources[4].Observationsoftheseandothernaturalcol- to the maturation of technology that makes it possible lectives have motivated a series of works on the model- todevelopmeaningfulcooperationamong,say,groupsof ingofmovementinanimalgroups[4,2,9]aswellasother wheeledorflyingvehicles,andpartlybecauseoftheobvi- workoncooperativecontrolstrategies,fromdistributed oussuccessofbiologicalcooperativesystemswhichhave collective covering and searching [18,14], to estimating apparently evolved to “perform as more than the sum by groups [15,11], and biologically-motivated optimiza- of their parts” [6,13]. Notable examples include worker tion [5,3,8]. (cid:1) ThisworkwassupportedbytheNationalScienceFounda- In particular, [2] presented a decentralized organizing tion under Grant No. EIA0088081 and by ARO ODDR&E principle, inspired by the foraging behaviors of ant MURI01 Grant No. DAAD19-01-1-0465, (Center for Com- colonies, that allowed a group to optimize an initial municating Networked Control Systems, through Boston R2 University). pathbetweentwolocationson ,byhavingasequence ∗ Corresponding author. Tel: +30-2310-891721, Fax: +30- of group members travel from one location to the other 2310-891290. while each member points its velocity vector towards Emailaddresses: [email protected](ChengShao), its predecessor. This so-called “local pursuit” rule was [email protected](DimitriosHristu-Varsakelis). generalized to non-Euclidean environments [7], and 13April2005 later to a pair of pursuit-based algorithms that applied anumericaloptimizationmethod,titled“pursuit-based to a much broader class of optimal control problems multiple shooting” (PBMS), that combines a recently and to systems with non-trivial dynamics[8,16]. These reportedlocal pursuitstrategywithestablishednumer- algorithms require each agent in the collective to solve ical methods. In particular, we propose a modified ver- aseriesoflocallyoptimalcontrolproblemswithinsmall sion of the well-known multiple-shooting (MS) method neighborhoodsthataredefinedbyitscurrentstateand that uses local pursuit to overcome some of the limita- the state of its predecessor. Doing so defines a con- tions of the original MS technique. The PBMS method trol/trajectorysequencethat,undercertainconditions, thatisproposedhereinvolvessimulatinggroupofagents converges totheoptimum. which cooperate to solve (numerically) an optimal con- trolproblem.AgentsusestandardMStooptimizetheir Local pursuit was originally conceived as a means of short-termbehavior. solving optimal control problems in settings where mapping, communication and sensing capabilities were Acomparisonofthecomputationalcomplexityandsize severely limited. The algorithm manages to avoid the of the resulting problems formulated via PBMS ver- need for global information by breaking up the prob- susstandardMS,revealsthatcooperationamonggroup lem into many pieces, each to be optimized by leader- members can overcome some important limitations of followeragentpairs.Itskeyfeatureisareductioninthe MS.Ourpursuit-basedformulationallowsonetoalways range(measuredbytime,distanceorothermetric)over operate in the domain of manageable-sized problems, which computations must take place, by paying a price while still being able to treat problems with very large in terms of the number of agents that are necessary to numbers of segments. As a result, it enlarges the maxi- carry out the algorithm. For example, in order for the mumsizeofproblemsthatcanbehandledgivenafixed collective to solve an optimal control problem, it is not amountofstoragespace.Inaddition,itcanreducecom- necessary to have available a map of the environment, putationalerrorsduetoill-conditioningofthenon-linear an agreed-upon coordinate system, or even the coordi- programming problem that must be solved when MS is nates of the target state; only an initial feasible control used. The trade-off is that, for well-conditioned prob- is required. lems, demands more running time than MS in order to converge. Up to now, discussions of local pursuit have focused on broadeningthedomainofapplicabilityofthealgorithm Theremainderofthispaperisorganizedasfollows:Sec- and on the limiting properties of the agents’ trajecto- tion2definesalocalpursuitalgorithmwhichisapplica- ries.However,itslimitedinformationrequirementsmake bletooptimalcontrolproblemswithfixedfinaltimeand it a potentially useful tool in numerical trajectory op- partially-constrainedfinalstates,andgivesthemainre- timization, where there are often similar trade-offs to sults concerning the behavior of a collective under that be made. In particular, given a control system which algorithm.Section3beginswithabriefintroductionto mustbesteeredbetweentwostatesinafixedamountof numericaloptimizationandgoesontodiscussthepoten- time, one typically proceeds by “sampling” the system tial advantages of modifying multiple shooting (one of trajectory at pre-determined times, and solving a non- the best-known and widely used methods for numerical linear programming (NLP) problem to determine the optimalcontrol)toincludelocalpursuit.Anillustrative bestplacementofthesamplesinthestatespace,subject exampleispresentedinSection4wherewecomparethe toconstraintsgivenbytheequationsofmotionandcon- performanceofmultipleshootingtothatofpuruit-based tinuity,amongothers.Inthatsetting,thereisabalance multiple shooting inasatellite orbit transferproblem. that must be struck between the number of trajectory samples and the size of the time intervals that separate 2 A Bio-inspired Algorithm for Optimal Con- them.Ononehand,wewouldliketokeepthenumberof trol segmentssmall,sothattheassociatedNLPproblemhas reasonable storage requirements. If the number of seg- Local pursuit begins with a group of cooperating ments is too small however, then propagating the state “agents”, where the term “agent” refers to a copy of a vector from one sample point to the next may require dynamical system: high-order approximation of the equations of motion, andthusbecometime-consuming.If,ontheotherhand, the trajectory is sampled densely, then there may not x˙k =f(xk,uk), xk(t)∈Rn,uk(t)∈Ω⊂Rm (1) be adequate memory to solve the associated nonlinear fork =0,1,2....Theproblemofinterestisasfollows: programming (NLP) problem, which at the same time ispronetoamoresignificantaccumulationofnumerical Problem1 Find a trajectory x∗(t), and a final state errors. x∗(T),(T fixed)thatminimize The purpose of this paper is to explore the application (cid:1) of local pursuit as a computational tool in order to ad- t0+T J(x,x˙,t0,T)= g(x,x˙)dt+G(x(t0+T)) (2) dress the shortcomings outlined above. We introduce t0 2 subjecttotheconstraintsx(t0)=x0,andQ(x(t0+T))= trajectory defined by (4) (see Fig. 1 for illustration). 0. Agents do not monitor their predecessors continuously, butinsteadupdatestheirtrajectorieswiththesampling Hereitisassumedthatg(x(t),x˙(t))≥0, G(x(t0+T))≥ rateof1/δ.Thepreciserulesthatgovernthemovement 0andthatQ(·) isanalgebraic function ofthestate. ofeachagentare: Definition 1 GiventhefinalstateconstraintQ(x)=0, Algorithm1 (Sampled Local Pursuit): Identify the D theconstraint setofxis starting state x0 on and the constraint set SQ. Let x0(t) (t ∈ [0,T0]) be an initial trajectory satisfy- SQ ={x∈Rn|Q(x)=0}. ing (1) with x0(0) = x0, Q(x0(T0)) = 0. Choose the pursuit interval ∆ and updating interval δ such that 0<δ <∆≤T0. For any pair of fixed states a,b ∈ D ⊂ Rn, suppose the optimal trajectory from a to b with fixed final time (1) For k = 1,2,3..., let tk = k∆ be the starting time (minimizing J with respect to x only) is denoted by of the kth agent, i.e. uk(t) = 0, xk(t) = x0 for x∗(t).Thenthecostoffollowingx∗(·) isdenotedby: 0≤t≤tk. (2) When t = tk + iδ,i = 0,1,2,3,..., calculate the (cid:1) t0+T ⎧controlu∗t(τ)thatachieves(subj.to(1)): η(a,b,t0,T)(cid:1) t0 g(x∗(t),x˙∗(t))dt (3) ⎪⎪⎪⎪⎪⎨η(xk(t),xk−1(t),t,∆), if xk−1(t)∈/ SQ (τ ∈[t,t+∆]) subjecttox(t0)=a, x(t0+T)=b. ⎪⎪⎪⎪⎪⎩ηQ(xk(t),t,T −(t−tk)), if xk−1(t)∈SQ Now, let x∗(t) be the optimal trajectory (over T units (τ ∈[t,tk+T]) Tofhteimcoes)tfroofmfolalonwininitgiaxl∗s(t·a)tiesadetnoottheedcboyn:straintsetSQ. (3) A[tkpp+lyiδu,kt(kt)+=(iu+∗tk1+)iδδ)(ti)ft∆o +theiδk<th Tag,enort ffoorr tt ∈∈ (cid:1) [tk+iδ,tk+T)otherwise. ηQ(a,t0,T)(cid:1) t0+T g(x∗,x˙∗)dt+G(x∗(t0+T)) (4) Repeat fromstep2untilthekth agentreachesSQ. t0 =minJ(x,x˙,t0,T) (4) x subjecttox(t0)=a,Q(x(t0+T))=0. Initial Trajectory Limit Trajectory Locally Optimal 2.1 Algorithm Trajectory In previous works [8,16] we have proposed two classes Xk-1 of algorithms, one for problems with fixed boundary conditions (final time and final states), and another for Xk SQ ptpfiiunrooranbslsul.teiiWmtmfseoerwwatniirltdlahjbeprpcaitearofltrrityayialplolylrpy-ect-sicoemonnnsitztsaartart“aiionsinnaeedmpdprfiolnbebaodlleu”mnsvtdsaeartwrseyiisothn(cbofilaonxscdeeaidd-l Xk+1 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx S M onalgorithms from[8,16]). We assume that there is an available initial (but sub- optimal)feasiblecontrol/trajectorypair(ufeas(t),xfeas(t)) Fig.1.Illustrationoflocalpursuitonamanifold.Eachagent except the first one is trying to “catch up” its predecessor for (1), which could have been obtained through explo- rationorfroma-prioriknowledge.Agentsarescheduled before the predecessor reaches SQ, and trying to reach SQ viaalocally optimal trajectoryotherwise. to leave the starting state x0 sequentially, separated by a pursuit interval of ∆ time units, i.e., if the first (ini- Thetwoadjustableparametersinsampledlocalpursuit tial) agent leaves at time t0, the kth agent will leave at (SLP)algorithmarethepursuitinterval∆,whichdefines tk = t0+k∆,k = 0,1,2,.... The first agent is required how frequent agents depart from x0 (consequently how tofollowxfeas fromx0 toSQ.Thenextagentattempts farawayagentsareseparated),andtheupdatinginterval to intercept its predecessor along an optimal trajecto- δ, which defines the frequency with which each agent ries defined by (3), as long as the predecessor has not updates its trajectory. We will refer to two successive yet reached the target set SQ. If the predecessor is al- agents as a “pursuit pair”. Within a pair, the (k−1)th ready in SQ then the follower moves along the optimal agent will be known as the “leader”, and the kth agent 3 as the “follower”. We will refer to the times tik = tk + 3.1 ABriefReview ofMultipleShooting iδ,i = 0,1,2,3... as the “updating times”. The SLP algorithmemploystwotypesofpursuit–“catchingup” In principle, MS is a type of NLP method[1]. Suppose, and “free running” – depending on whether the leader fornow,thatweareinterestedinminimizingthescalar hasreachedthefinalconstraintsetSQornot.Theformer cost function F(x) : Rn → R, subject to m (m ≤ n) lets agents “learn” from their leaders, meanwhile the equality constraints c(x) = 0, where c(x) is an (m×1) latterenablesthemtofindtheoptimalfinalstate.Thisis vector. Todothis,wefirstintroducetheLagrangian: differentfromtheversionofsampledlocalpursuitin[8], which involves only “catching up” stages, and because T L(x,λ)=F(x)+λ c(x) (5) δ isfixed,ratherthandeterminedon-line(asisthecase in[16]),thetotalnumberofupdatesperformedbyeach Thenecessaryconditionsforthepoint(x∗,λ∗)tobean agent ispre-determined. optimum are satisfied by the stationary points of the Lagrangian 1: We now state the main result concerning the limiting behavior of the SLP algorithm. Before proceeding to ∇xL(x,λ)=∇xF(x)+∇xc(x)Tλ=0 (6) the main theorem, we must require that the cost of (2) changes “little” for small changes to the endpoints of a and trajectory: ∇λL(x,λ)=c(x)=0 (7) Condition 1 Assume for a generic trajectory x1(t) thereexistsanε>0suchthatforalla,b1,b2 ∈Dandall The equations (6)∼(7) are usually solved via Newton’s ∆ > 0, there exists a trajectory x2(t) such that the cost method. Proceeding formally we obtain the following C(x1,0,T)(x1(0)=a,x1(T)=b1)fromatob1andcost Karush-Kuhn-Tuckersystem: C(x2,0,T) (x2(0)=a,x2(T)=b2)fromatob2 satisfy (cid:6) (cid:7)(cid:6) (cid:7) (cid:6) (cid:7) (cid:6)b1−b2(cid:6)∞ <ε HL ∇xc(x)T ∆x −∇xF(x)−∇xc(x)Tλ ⇒(cid:6)C(x1,0,T)−C(x2,0,T)(cid:6)∞ <L∆ = ∇xc(x) 0 ∆λ −c(x) forsomeconstant Lindependent of∆. (8) Theorem 1 Suppose a group of agents evolve under where∆xisthe“search-direction”andHListheHessian sampled local pursuit, and that at every updating time oftheLagrangian ti,thelocallyoptimaltrajectoriesfromfollowertoleader k (cid:8)m are unique. If the updating interval δ and pursuit inter- val∆satisfy0<δ <∆andCondition1holds,thenthe HL =∇2xF(x)+ λi∇2xci (9) limiting trajectory obtained is unique and locally opti- i=1 mal. It is also smooth if the locally optimal trajectories calculatedateveryupdatingtimearesmooth. Forconvenience, wedefine (cid:6) (cid:7) HL ∇xc(x)T H = (10) PROOF. See[8,16]. ∇xc(x) 0 Suppose now that we are interested in optimizing (2) subjecttothedynamics 3 Local Pursuit as a Numerical Computational x˙ =f(x(t),u(t)). (11) Tool 1 Here,weusethenotation: Multiple-shooting, together with its various improved forms,couldbeconsideredaworkhorseofnumericalop- ∂a1 ∂a1 ··· ∂a1 timization. However, the large storage requirements as- ∂x1 ∂x2 ∂xn (cid:0) ∂a2 ∂a2 ··· ∂a2 (cid:3) sociatedwithMSlimittheso-called“permittingsize”of ∇xA(x)= ∂x1 ∂x2 ∂xn problemswhichcanbesolvedonadigitalcomputer[1]. (cid:1) ··· ··· ··· ··· (cid:4) (cid:1) (cid:4) In the following, we briefly review the technique known (cid:1)(cid:1) ∂an ∂an ··· ∂an (cid:4)(cid:4) as multiple shooting (MS) before showing how it im- (cid:1) ∂x1 ∂x2 ∂xn (cid:4) (cid:2) (cid:5) proves whencombined with SLP. 4 Doing so with MS (here we mainly refer to di- theprecisestateevolutionbyintegration.Second,using rect MS) involves breaking up the trajectory into linearapproximationreducescomputationalburden;on “shorter” pieces [1] by partitioning the time domain the other hand, higher-order approximations are likely [t0,tf] = ∪Ni=−11[ti,ti+1), t0 = t1 < ··· < tN = tf. Each to produce more precise results, but they will also give subinterval [ti,ti+1)iscalled a“segment”. risetomorecomplicated equations in(6)∼(13) andthe resultingNLPproblemwill requiremoretime tosolve. Multiple shooting uses NLP to find the optimal trajec- tory (i.e. to minimize a scalar function along a trajec- Oneshortcomingoflinearapproximationisthatthetime tory), by defining the NLP variables ν = [ν1,...,νN] separation between neighboring points must be limited to be the arguments – usually they are concatenations to a small time step h. If h is “too large”, then x¯i+1 of the states and the corresponding controls – at times will not be a good estimate of the true state of (11) as t1,...,tN along a trajectory. The stationary points it evolves from xi to xi+1 under ui, and NLP will pro- [ν∗,...,ν∗] obtained via NLP will approach points on duceerroneousresults.Theestimationerrorisofo(hn), 1 N the optimal trajectory. Without loss of generality, we wheren∈Nistheorderoftheapproximationmethods choose to sample the state and control vectors at the [10].Itisclearthatasmallerhresultsinbetterapproxi- segment endpoints,anddefinetheNLPvariables mation.Forthatreason,keepingthesegmentsizesmall is desirable and is a key feature of MS compared with ν ={x1,u1,x2,u2,...,xN,uN} (12) other single-stepshooting methods [1]. Notice that the dimension of xi and ui, i = 1,...,N is Let us assume that we can fix an acceptable h, i.e., Mx andMu,respectively. one that is small enough to lead to convergence for the associated NLP. Then the permitting size of problems AstheNLPvariablesareadjusted,onemustensurethat thatcanbesolvedbyMSdependssolelyonthenumber they satisfy the the system dynamics (11), and that se- of steps N required to cover the time span [0,T], with quentialtrajectorysegmentsobeyamatchingcondition T =(N −1)h.Thus,solvinglarge-scaleproblems(with at their boundaries. This means that the evolution of large time spans), requires a large number of segments (11)fromxi atti withinputui,shouldsteerthesystem N = Th+1. For problems with varying dynamics but to xi+1 at ti+1. This suggests that the following con- fixed time span, the requirement for N could be large straints arenecessary: because nonlinearities in the dynamics make it neces- sarytousesmalltimestepstoensurethattheestimates ⎡ ⎤ ⎢ x2−x¯1 ⎥ cx¯oinavreergperneccies.e enough and that the associated NLP will ⎢ ⎥ ⎢ x3−x¯2 ⎥ ⎢ ⎥ ⎢ . ⎥ Remark2 For any given set of dynamics and choice c(x)=⎢⎢ .. ⎥⎥=0 (13) of NLP method, the time step h is upper bounded. The ⎢ ⎥ ⎢xN −x¯N−1⎥ permitting size in MS is proportional to the number of ⎢ ⎥ segmentsN,ifhis pre-determined. ⎢ ⎥ ⎣ φ0(x1,t0) ⎦ φT(xN,tf) Supposenowthatthedimensionsofsystemstatexand control u are Mx and Mu respectively, and the number of segments is N, then the dimension of NLP variables wherethefunctionsφ0(x1,t0)andφT(xN,tf)represent the initial and final condition constraints, respectively, for a multiple shooting method is nx = (Mx +Mu)N, andx¯iaretobecalculatedbyintegratingthedifferential andtheNLPhasatleastMxN constraints(thenumber ofconstraintscouldbelargerifthefinalstatesarefixed). equation (11) fromti toti+1. The Hessian in (9) is of dimension nx ×nx = (Mx + Mu)2N2. To proceed with NLP, one must obtain the In practice, one often computes x¯i only approximately, solution of (8), which requires finding the solution to insteadofintegratingthecompleteequationsofmotion. a linear system of algebraic equations with dimension For example, Euler’s method provides a first-order ap- at least (Mu +2Mx)2N2. Here we have seen that the proximation totheintegration of(11): number of segments involved in the calculation affects the“degreeoflabor-consumption” withO(N2). x¯i+1 =xi+hf(xi,ui) (14) 3.2 NumericalOptimizationbyLocalPursuit where h is the time step of each segment. The choice ofapproximationshouldbebasedonthefollowingcon- siderations: First, we have sampled the controls at the As we have seen, the number of segments used in MS segment boundaries (the samples are to be optimized) affects the dimension of the associated NLP variables instead of considering the continuous-time control on and the computational complexity of the NLP problem the intervals [ti,ti+1]; therefore, we can not compute to be solved. For that reason, it would be desirable to 5 use fewer segments with the MS method. However, be- Table1 Comparing the dimensions of the NLP problem variables cause the time step h is upper bounded for a fixed set whenusingMultiple Shooting(MS)vs.Pursuit-basedMul- ofdynamicsandNLPalgorithm,decreasingthenumber tiple Shooting(PBMS). ofsegmentsmeansthatNLPprocesscanonlydealwith MS PBMS trajectories spanning shorter time intervals. This limi- tation can be circumvented by combining local pursuit dim(ν) (Mx+Mu)N (Mx+Mu)N∆ with direct MS, to obtain what we refer to as PBMS. Specifically, one can introduce a sequence of simulated dim(c(x)) MxN MxN∆ agentsthatpursueeachotherusingthealgorithmgiven dim(H) ((Mu+2Mx)N)2 ((Mu+2Mx)N∆)2 in Sec. 2. Each agent will use MS to compute the opti- mal trajectory from its own state to that of its leader, for MS, vs. N∆2 for PBMS. Operating on large matrices giving rise to a series of smaller NLP problems, whose consumeslargeamountsofmemory,especiallywhenus- timespanislimitedbythepursuitinterval∆.Although ingnon-iterativemethods,e.g.,thememoryofatypical therewillbemoreoftheseNLPproblemstosolve,their desktop PC can be easily used up by a matrix with di- lower dimension will make it possible to handle larger mension of 5000×5000 in Matlab when using 64 bits optimization problemsoverall. of digital precision. However, under PBMS, the matrix sizeisscaleddownbyafactorof(N∆/N)×(N∆/N)and hardwarerequirementscanbedecreasedsignificantlyfor 3.2.1 Decreasing the size of the NLP problems during anygiven problem. computation We note that under PBMS, each agent needs to solve Multiple Shooting XK+1 (N −N∆)/Nδ “smaller”MSproblemsinordertoreach thetargetsetSQ fromtheinitialstatex0,butthetime XK neededtodoso–denotedbyTa –willgenerallybeless thanthetotaliterativetimeinmultipleshooting,which we will denote by TMS. Of course, the total running timeforPBMS–denotedbyTLP –maybegreaterthan Local Pursuit TMS because local pursuit relies on multiple agents to converge to the optimum. Our experience in prior work XK+1 on local pursuit and in various numerical experiments (including the example in next Section) has been that, XK for well-conditioned problems, the convergence rate of PBMS is usually slower than that of MS. As expected, decreasing N∆ leads to slower convergence for PBMS. XK+2 However, the added running time comes with the ben- efit of lower memory requirements, allowing us to han- Fig. 2. Local pursuit could decrease the problem size in- dle problems with larger state vectors and longer time volvedincalculationateveryupdatingstep.Thenumberof horizons. At the same time, if N∆ is decreased and the variablescalculatedinmultipleshootingisN =13,andthe availablestorageisfixed,onecanaffordtoalsodecrease numberofvariablescalculatedbyeachagentinlocalpursuit thesegmentsizeh.Doingsohastheeffectofimproving is N∆ =5. thestateestimatesx¯i usedtoformulatetheNLP,with- For simplicity, fix h = T/(N −1) and select the pur- out making the problem ill-conditioned. This situation suit interval ∆ = (N∆ −1)h, N∆ ∈ N, where N∆ is will beillustrated inthenextSection. the number of segments within ∆. Usually we will have N∆ << N. For convenience, we also choose the updat- 3.2.2 Reducing numerical error when H is ill- ingintervalintheSLPalgorithmtobeanintegermulti- conditioned pleofsegmentsize,δ =Nδh, Nδ ∈Nsothattheagents are always updating their trajectories at the times ti Besidesmaximumproblemsize,anotherimportantcon- whichwechosetosamplethetrajectoryof(11).Ateach siderationinnumericaloptimalcontrol,isthecomputa- updating step t = iδ, each agent is solving a NLP over tional error introduced by the finite precision of digital N∆ segments,withatimespanof(N∆−1)hinsteadof computers and by algorithmic accuracy. It is possible (N−1)h,asillustratedinFig.2.Becausethecomputa- that the error is too large to obtain useful results, e.g., tional complexity of MS is related to the square of the solvingalinearsystemwithlargeconditionnumbercan number of segments, using N∆ << N will significantly result in unacceptable errors and non-convergence. In decrease the computational burden for each agent. Ta- such settings, correction algorithms, such as Tikhonov ble 1 shows the dimensions of the NLP vector, ν, the regularization,canbeapplied[10,12];theyare,however, constraints c(x), and the matrix H in Eq. (10), respec- timeandstorageconsuming,anddonotalwayssucceed. tively, for MS and PBMS. The dimension of the associ- atedKarush-Kuhn-TuckersystemisontheorderofN2 Fortheoptimalcontrolproblemofinterest,consider,for 6 example, using Gaussian elimination to solve (8) with 4 Example:AnOrbitTransferProblem limited digital precision. Every step of the elimination algorithmintroducessometruncationerror,sothatthe Consider an idealized spacecraft which must be trans- totalaccumulatederrorwhensolvingalargelinearsys- fered from one stable orbit2 to another, within some tem will generally be much larger than that associated fixed time T. For simplicity, we only consider the effect withsolvingoneoflowerdimension,becausethenumber of the Earth and restrict the problem to a plane, as il- ofstepsrequiredbythealgorithmisproportionaltothe lustratedininFig. 3. systemsize.Furthermore,ifH in(10)isill-conditioned, then the numerical solution of (8) introduces small er- rors which accumulate towards the final segment N. If I theproblem’stimehorizonislong, theaccumulated er- Spacecra ror may lead to erroneous results, or prevent MS from converging. PBMS can help reduce these numerical er- R rors because the algorithm’s simulated agents solve MS poowFffruho(tri8bhtch)hleeefomPorrmrBsigeowMiarniceSath,hliwapfagireleosvlhnbesotrlueyricmtscleore.reecTtddaihulmlwicyseehodiehmproateprniimlMdzieoatsSnlht,tfeharcraaioeltjmewedtcphitltaeloorbrdeyceidomscntaaeovtsnieetssrshfiigoaeiennst. xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxExxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxaxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxrxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxtxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxhxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx the convergence criteria of the numerical method used tosolvethe“short-range”MSproblemsbetweenleader- follower pairs, then the convergence of the agents’ tra- jectorysequenceisguaranteedbythelocalpursuitalgo- Fig.3.AplanarspacecraftinorbitaroundtheEarth. rithmitself. Thedynamics ofthissystemare 3.2.3 Remarks r¨= θ˙2 − uE + Pu1 r r2 mg In summary, the combination of MS with local pursuit θ¨=−2θ˙r˙ + Pu2 can increase the permitting size of problems that can r mgr be handled with fixed storage, because at every updat- P(u2+u2) m˙ =− 1 2 (15) ingstep,eachagentdealswithaproblemwith“reduced g2 size”.ThedevelopmentofNLPalgorithms(andofMS) u1=Isin(ϕ) hasfollowedthegrowthofthedigitalcomputer.Thesize ofatypicalapplicationintheearly1960swasn,m≈10, u2=Icos(ϕ) whileinthe1970sandearly1980smostapplicationwere of size as n,m<100. With subsequent advances in lin- where r is the distance between the spacecraft and the earalgebratechniques,suchasmatrixsparsity,andon- center of Earth, θ is its longitude with respect to the going progress in the semiconductor industry, the per- horizontal line, m is the mass of the spacecraft, uE is mitting size in late 1990s was n,m ≈ 10,000 [1]. How- the gravitational parameter of Earth, P is a constant ever, when using a fixed time partition, PBMS involves concerning engine power, and g is the acceleration of solvingproblemsofsizeN∆ insteadofN.Therefore,we gravity at sea level [17]. The control inputs, u1 and u2, can address much larger problems under the limits im- are functions of I and ϕ, the thrust force and thrust posed by the hardware. Although it requires more run- steering angle with respect to the tangent of the local ning time, PBMS does provide a feasible solution when orbit. thetraditionalformulationexceedsthoselimits,making itimpossible toproceed. We would like to minimize the fuel consumed (equiva- lently, tomaximize thefinalmassm(T)): (cid:1) In cases where MS fails to converge because the er- T 2 2 rors introduced by the approximation to (11) are too J = (u1(t) +u2(t) )dt great,PBMSmaysucceedbyreducingthesegmentsN∆, 0 thereby reducing the accumulated error over the tra- jectory of a single agent. In next section we present an 2 The term “stable orbit” means that the spacecraft will example of MS stagnancy caused by an ill-conditioned remainonthisorbitifnoexternalforce(otherthangravity) matrix H andhowPBMScanavoid theproblem. isapplied,i.e. θ˙= uE/r3. (cid:6) 7 whilesteeringthesystem(1(cid:15)5)fromtheinitialcondition 1E−8 (to guarantee little improvement with future it- r(0)=R0,r˙(0)=0,θ˙(0)= uE/R03,m(0)=(cid:15)M0 tothe erations)and(cid:6)c(x)(cid:6)≤(1E−15)×m,wheremwasthe final condition r(T) = RT,r˙(T) = 0,θ˙(T) = uE/RT3, dtoimryensasitoisnfieosf ctohnesstyrasitnetms (dtyongaumaircasn)t.eTehtehattottahlettirmaejecT- where T is fixed. For simplicity, we assumed there was was fixed to 300 minutes. The numerical properties of no upper bound of the thrust force, thus there is no theproblem–andconsequentlytheperformanceofthe restriction forthecontrolu1 andu2. twomethods–dependededontheselectionofthetotal numberofsegments N. 4.1 ApplyingMSvs.PBMS 4.2.1 Well-conditionedcase To solve the problem by MS, we used Euler’s method to estimate the piecewise integration of Eq. (11) and With N = 101, the matrix in Eq. (8) was well condi- imposedthefollowing constraints: tioned (can be verifiedby its condition number and the fast convergence rate for MS shown below). The per- 0=rj+1−r¯j formance of MS and PBMS are summarized in Table =rj+1−(rj +hjr˙j) 2, where c(H) was the condition number of matrix H; 0=r˙j+1−r¯˙j The “iterations” column lists the iteration numbers for =r˙j+1−(r˙j +hj(θr˙jj2 − urEj2 + Pmuj1gj)) cnoenedveerdgfeonrceloicnalmpuultrispulietsthoocootninvge,rgaen,drensupmecbteivreolyf.agents 0=θ˙j+1−θ¯˙j TCaobmlepa2rison between Multiple Shooting and Local Pursuit =θ˙j+1−(θ˙j +hj(−2θ˙rjjr˙j + mPjug2rjj)) inweNll=-c1o0n1ditionedcaMseS PBMS PBMS 0=mj+1−m¯j (N∆ =30, (N∆ =60, =mj+1−(mj −hjP(u21g+2 u22)) Nδ =16) Nδ =32) Comp.Time 66.8594 793.8594 430.2344 forj =1,2,3...,N −1and Iterations 14 277 41 0=r1−R0 0=r˙1−R˙0 m(T) 0.524246124 0.524245714 0.524246100 0=θ˙1−θ˙0 (cid:3)c(x)(cid:3) 3.7144E-14 9.2499E-15 9.9170E-15 0=m1−M0 Ave(c(H)) 1.5931E+6 4.8919E+5 5.3612E+5 0=rT −RT 0=r˙T −R˙T 0=θ˙T −θ˙T (16) Wecanseethatbothmethodsweresuccessfulandthat the convergence rate of PBMS was slower than that of where ri = r((i−1)h),r˙i = r˙((i −1)h),... and h = MS.IncreasingN∆resultedinincreasedneedforstorage T/(N − 1) was the time step. The dimension of con- and decreased running time. If N∆ = N, then PBMS straints c(x) (m as we denoted before) was (4×(N − reduces to MS. The final trajectories obtained by both 1)+7). TheNLPvariable methods areshowninFig.4. ν={r1,...,rN,r˙1,...,r˙N,θ˙1,...,θ˙N,m1,...,mN 4.2.2 Ill-conditionedcase ,u11,...,u1N−1,u21,...,u2N−1} (17) When the segment size was halved, i.e., N ≥ 201, MS wasofdimension (6×N −2). became stagnant because the matrix H in (10) was ill- conditioned. The error generated by solving Karush- Kuhn-Tucker system became so large that the NLP al- When applying local pursuit, the N should be replaced gorithm (using Newton’s method) was not able to con- by N∆ in the above equations. The dimension of the constraint set and the NLP variable were (4×(N∆ − verge. The large condition number of H (our criteria of 1)+7)and(6×N∆−2),respectively. ill-condition) was due to the large number of segments (infactH’sconditionnumberincreasedwiththesizeof segments,seeTable3).ForPBMS,thereductioninthe 4.2 Results numberofsegmentsforthesub-problemssolvedbypur- suingagentsmeantareductionintheconditionnumber WesolvedtheproblembothbyPBMSandstandardMS. ofH whenusingsmallN∆.Theresultsfrombothmeth- Thecriteriaforconvergencewere(cid:6)m(T)i+1−m(T)i(cid:6)≤ odsaresummarizedinTable3. 8