ebook img

Using the simulated annealing algorithm to solve the optimal control problem PDF

17 Pages·2012·0.41 MB·English
by  
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Using the simulated annealing algorithm to solve the optimal control problem

Using the simulated annealing algorithm to solve the optimal control problem 189 Using the simulated annealing algorithm to solve the optimal control 100 problem Using the simulated annealing algorithm Horacio Martínez-Alfaro to solve the optimal control problem HoracioMartínez-Alfaro [email protected] TecnológicodeMonterrey,CampusMonterrey México 1. Introduction A lot of research has been done in Automatic Control Systems during the last decade and morerecentlyindiscretecontrolsystemsduetothepopularuseofpowerfulpersonalcom- puters. This work presents an approach to solve the Discrete-Time Time Invariant Linear Quadratic(LQ)OptimalControlproblemwhichminimizesaspecificperformanceindex(ei- therminimumtimeand/orminimumenergy). Thedesignapproachpresentedinthispaper transformstheLQproblemintoacombinatorialoptimizationproblem. TheSimulatedAn- nealing(SA)algorithmisusedtocarryouttheoptimization. SimulatedAnnealingisbasicallyaniterativeimprovementstrategyaugmentedbyacriterion foroccasionallyacceptingconfigurationswithhighervaluesoftheperformanceindex(Mal- horta et al., 1991; Martínez-Alfaro & Flugrad, 1994; Martínez-Alfaro & Ulloa-Pérez, 1996; Rutenbar,1989). Givenaperformanceindex J(z)(analogtotheenergyofthematerial)and an initial configuration z , the iterative improvement solution is seeked by randomly per- 0 turbing z . TheMetropolisalgorithm(Martínez-Alfaro&Flugrad,1994;Martínez-Alfaro& 0 Ulloa-Pérez,1996;Rutenbar,1989)wasusedforacceptance/rejectionoftheperturbedconfig- uration. Inthisdesignapproach,SAwasusedtominimizedtheperformanceindexoftheLQproblem andasresultobtainingthevaluesofthefeedbackgainmatrixKthatmakestablethefeedback systemandminimizetheperformanceindexofthecontrolsysteminstatespacerepresenta- tion(Ogata,1995). TheSAalgorithmstartswithaninitialfeedbackgainmatrixKandeval- uatestheperformanceindex. Thecurrent K isperturbedtogenerateanother Knew andthe performanceindexisevaluated. Theacceptance/rejectioncriteriaisbasedontheMetropolis algorithm.Thisprocedureisrepeatedunderacoolingschedule.Someexperimentswereper- formedwithfirstthroughthirdorderplantsforRegulationandTracking,SingleInput-Single Output(SISO)andMultipleInput-MultipleOutput(MIMO)systems. MatlabandSimulink wereusedassimulationsoftwaretocarryouttheexperiments. TheparametersoftheSAalgorithm(perturbationsize,initialtemperature,numberofMarkov chains,etc.)werespeciallytunnedforeachplant. Additionalexperimentswereperformedwithnon-conventionalperformanceindicesfortrack- ing problems (Steffanoni Palacios, 1998) where characteristics like maximum overshoot max(y(k)−r(k)),manipulationsoftnessindex|u(k+1)−u(k)|,outputsoftnessindex|y(k+ 1)−y(k)|,andtheerrormagnitude|r(k)−y(k)|. www.intechopen.com 190 Simulated Annealing, Theory with Applications TheproposedschemewiththeuseoftheSAalgorithmshowedtobeanothergoodtoolfor discreteoptimalcontrolsystemsdesigneventhoughonlylineartimeinvariantplantswere considered (Grimble & Johnson, 1988; Ogata, 1987; 1995; Salgado et al., 2001; Santina et al., 1994). AlargeCPUtimewasinvolvedinthisschemeinordertoobtainsimilarresultstothe onesbyLQ.Thedesignprocessissimplifiedduetotheuseofgainmatricesthatgenerateasta- blefeedbacksystem. Theequationsrequiredarethoseuseforthesimulationofthefeedback systemwhichareverysimpleandveryeasytoimplement. 2. Methodology Theprocedureisdescribedasfollow: 1. ProposeainitialsolutionKinitial. 2. Evaluatetheperformanceindexandsaveinitialcost Jinitial(Kinitial). Kinitial needstobe convertedtomatricesK andK fortrackingsystems. 1 2 3. RandomlyperturbKinitialtoobtainaKnew. 4. EvaluatetheperformanceindexandsaveinitialcostJnew(Knew). 5. AcceptorrejectKnewaccordingtotheMetropoliscriterion. 6. Ifaccepted,Kinitial ←Knew,decrementtemperatureaccordingtoJnew/Jinitial. 7. Repeatfromstep3. OnceaMarkovchainiscompleted,decrementthetemperature, Ti+1 = αTi,whereTi repre- sentsthecurrenttemperatureandα =0.9(Martínez-Alfaro&Flugrad,1994). Theprocedure endswhenthefinaltemperatureoracertainnumberofMarkovchainshasbeenreached. 3. Implementation ThecodewasimplementedinMatlab,andthemodelsweredesignforRegulationandTrack- ing,SISOandMIMOsystems. Adiscreteoptimalcontrolsystemcanberepresentedasfollows(Ogata,1995): x(k+1)=Gx(k)+Hu(k) (1) y(k)=Cx(k)+Du(k) (2) wherex(n×1)isthestatevector,y(m×1)istheoutputvector,u(r×1)isthecontrolvector,G(n×n) isthestatematrix,H(n×r) istheinputmatrix,C(m×n) istheoutputmatrix,andD(m×r) isthe directtransmissionmatrix. Inan LQproblem thesolution determinesthe optimalcontrol sequence for u(k) that mini- mizestheperformanceindex(Ogata,1995). 3.1 Regulation TheequationthatdefinetheperformanceindexforaRegulatoris(Ogata,1987): J = 1 N∑−1[x′(k)Qx(k)+u′(k)Ru(k)] (3) 2 k=0 where Q(n×n) is positive definite or positive semidefinite Hermitian matrix, Q(n×n) is pos- itive definite or positive semidefinite Hermitian matrix, and N is the number of samples. Equation3representstheobjectivefunctionoftheSAalgorithm. www.intechopen.com Using the simulated annealing algorithm to solve the optimal control problem 191 3.2 Tracking Atrackingsystemcanberepresentedasfollows(Ogata,1987): x(k+1)=Gx(k)+Hu(k), u(k)=K v(k)−K x(k) 1 2 (4) y(k)=Cx(k), v(k)=r(k)−y(k)+v(k−1) wherexisthestatevector,uisthecontrolvector,yistheoutputvector,ristheinputreference vector,visthespeedvector,K istheintegralcontrolmatrix,K isthefeedbackmatrix,Gis 1 2 thestatematrix,Histheinputmatrix,andCistheoutputmatrix. TherepresentationusedinthisworkwasaRegulatorrepresentation(Ogata,1987): ξ(k+1)=Gˆ ξ(k)+Hˆ w(k), w(k)=−Kˆ ξ(k) (5) where: ξ(k)= xe(k) , Gˆ = G H (cid:31) ue(k) (cid:30) (cid:31) 0 0 (cid:30) 0 Hˆ = , Kˆ =(R+Hˆ′PˆHˆ)−1Hˆ′PˆGˆ, (6) (cid:31) Im (cid:30) [K2 K1]=(Kˆ +[0 Im])R(cid:31) GC−GIn CHH (cid:30)−1 andthestatesaredefinedas xe(k)=x(k)−x(∞), ue(k)=u(k)−u(∞) (7) Theperformanceindexis: ∞ J = 1 ∑ ξ′(k)Qˆ ξ(k)+w′(k)Rw(k) with Qˆ = Q 0 (8) 2 (cid:31) 0 0 (cid:30) k=0(cid:29) (cid:28) Sinceoursimultaionisfinite,theperformanceindexshouldbeevaluatedforNsamples: 1 N J = ∑ ξ′(k)Qˆ ξ(k)+w′(k)Rw(k) (9) 2 k=0(cid:29) (cid:28) 3.2.1 Non-conventionalperformanceindex Non-conventionalperformanceindexesarespeciallygoodwhenwedesiretoincludecertain outputand/orvectorcontrolcharacteristicsinadditiontotheonesprovidedbyastandard LQproblem. Theproposeperformanceindexis(SteffanoniPalacios,1998): N J =C ζ+C ϑ+C ϕ+ ∑ C ξ′(k)Qˆ ξ(k)+C w′(k)Rw(k)+C ε(k) (10) 1 2 3 4 5 6 k=0(cid:29) (cid:28) where • ζisthesoftnessindexofu(k)definedby|u(k+1)−u(k)|. • ϑisthemaximumovershootdefinedbymax(y(k)−r(k)). • ϕistheoutputsoftnessindexdefinedby|y(k+1)−y(k)|. • ε(k)istheerrordefinedby|r(k)−y(k)|. www.intechopen.com 192 Simulated Annealing, Theory with Applications • ξ(k)istheaugmentedstatevector. • w(k)istheaugmentedstate-inputvectorforthecontrollaw. • Qˆ andRaretheweightingmatricesforquadraticerror. • C,i = 1,...,6areweightingconstants. C yC take0or1valueswheatherornotto i 4 5 includethequadraticerror. ThisdescriptionisvalidonlyforSISOsystems.ThechangesforMIMOsystems(weconsider 200 justninputsandoutputs)are: 200 Softnessindexinvectoru(k) 150 e150 ζ =max(max(|ui(k+1)−ui(k)|), i=1,...,n) (11) peraturemperatur110000 Maximumovershoot me eT T 50 ϑ=max(max(y(k)−r(k)), i=1,...,n) (12) 50 i i Outputsoftnessindex 00 100 200 300 400 500 600 700 800 900 1000 1100 100 200 300 400 500 600 700 800 900 1000 1100 Accepted states ϕ=max(max(|y(k+1)−y(k)|), i=1,...,n) (13) Accepted states i i 250 Error 250 ε(k)=max(max(|r(k)−y(k)|), i=1,...,n) (14) i i T4.heESxApearlgimoreitnhtmsaisnbdasReedsounlttsheoneusedby(Martínez-Alfaro&Flugrad,1994). ance indexmance index 121250500000 FthoirsSwISoOrkswysetepmress,emntanjuystexthpeereixmpeenritmswenetrsewpeitrhfotrhmireddofrodrerrepglualnattso.rVaenrdytsriamckilianrgesxypsetreimmse.nItns PerformPerfor 110000 wereperformedwithMIMOsystems(regulatorandtracking),butweonlyworkwithtwo- 50 input-two-outputplants. 50 100 200 300 400 500 600 700 800 900 1000 1100 100 200 300 400 500 600 700 800 900 1000 1100 Accepted states Accepted states 4.1 SISOsystems 4.1.1 Regulator Thefollowingvaluesforathirdordersystemwere: 0 0 −0.25 1 1 0 0 G= 1 0 0 , H= 0 , Q= 0 1 0 , 0 1 0.5 1 0 0 1       −5 R=1, x(0)= 4.3  −6.8   TheSAalgorithmparameterswere: initialsolution= 0,maximumperturbation= 1,initial temperature=100,numberofMarkovchains=100,percentageofacceptance=80. TheSA algorithmfoundaJ =68.383218andLQaJ =68.367889.Table1showsthegains. Figure1,presentstheSAbehavior.Thestatesofbothcontrollersperformedsimilarly,Figure3; butwecanappreciatethatexistalittledifferencebetweenthem,Figure2. www.intechopen.com Using the simulated annealing algorithm to solve the optimal control problem 193 J K LQ J =68.367889 [−0.177028 −0.298681 −0.076100] SA J =68.383218 [−0.193591 −0.312924 −0.014769] Table1.Controllergains 200 200 150 e150 peraturemperatur110000 me eT T 50 50 0 0 100 200 300 400 500 600 700 800 900 1000 1100 100 200 300 400 500 600 700 800 900 1000 1100 Accepted states A(cac)epted states 250 250 ance indexmance index 121250500000 erformPerfor 100 P 100 50 50 100 200 300 400 500 600 700 800 900 1000 1100 100 200 300 400 500 600 700 800 900 1000 1100 Accepted states Accepted states (b) Fig.1.BehavioroftheSAalgorithm AccordingtoSection3.2,thetrackingsystemexperimentisnextwithisN=100. 0 1 0 0 0.5 G= 0 0 1 , H= 0 , CT = 1 , −.12 −.01 1 1 0       1 0 0 Q= 0 1 0 , R=10 0 0 1   www.intechopen.com 194 Simulated Annealing, Theory with Applications Error 0.06 0.04 0.02 0 (cid:1)0.02 (cid:1)0.04 (cid:1)0.06 Output 1.4 (cid:1)0.08 SA 0 2 4 6 8 10 12 14 16 18 20 LQ Fig.2.BehaviorofthestatedifferenceusingLQandSA. 1.2 SS u LttsaQaittenesgs using LQ 66 1 44 22 0.8 00 (cid:1)(cid:1)22 0.6 (cid:1)(cid:1)44 (cid:1)(cid:1)66 0.4 (cid:1)(cid:1)88 00 22 44 66 88 1100 1122 1144 1166 1188 2200 Fig.3.BehaviorofthestatesusingLQ. 0.2 SS u SttsaAaittenesgs using SA 66 0 44 0 2 4 6 8 10 12 14 16 18 20 22 00 (cid:1)(cid:1)22 (cid:1)(cid:1)44 (cid:1)(cid:1)66 (cid:1)(cid:1)88 00 22 44 66 88 1100 1122 1144 1166 1188 2200 Fig.4.BehaviorofstatesusingSA. www.intechopen.com Using the simulated annealing algorithm to solve the optimal control problem 195 Error 0.06 Yielding 0 1 0 0 0     0 0 1 0 0 0.04 Gˆ = −0.12 −0.01 1 1 , Hˆ = 0       0 0 0 0   1  0.02 TheSAalgorithmparameterswere:initialsolution=0,maximumperturbation=0.01,initial temperature=100,numberofMarkovchains=100,percentageofacceptance=80. LQob- tained a J = 2.537522 and SA a J = 2.537810. Although the indexes are very similar, gain 0 matricesdifferalittlebit(showninTable2).Figure6showsthestatesandFigure7theinput. K K (cid:1)0.02 1 2 LQ 0.290169 [−0.120000 0.063347 1.385170] SA 0.294318 [−0.107662 0.052728 1.402107] (cid:1)0.04 Table2.Controllergain (cid:1)0.06 Output 1.4 (cid:1)0.08 SA 0 2 4 6 8 10 12 14 16 18 20 LQ 1.2 SSttaatteess uussiinngg LLQQ 66 1 44 22 0.8 00 (cid:1)(cid:1)22 0.6 (cid:1)(cid:1)44 (cid:1)(cid:1)66 0.4 (cid:1)(cid:1)88 00 22 44 66 88 1100 1122 1144 1166 1188 2200 0.2 SSttaatteess uussiinngg SSAA 66 0 44 0 2 4 6 8 10 12 14 16 18 20 Fig.5.Output 22 00 (cid:1)(cid:1)22 4.1.2 Trackingwithnon-conventionalperformanceindex (cid:1)(cid:1)44 Several experiments were perform with type of index. This experiment was a third order (cid:1)(cid:1)66 plant, thesameofprevioussection. Thecoefficientvaluesfortheperformanceindexwere: C = 10, C = 10, C = 20, C = 1, C = 1, and C = 10. The SA algorithm parameters (cid:1)(cid:1)88 1 2 3 4 5 6 00 22 44 66 88 1100 1122 1144 1166 1188 2200 were: initialsolution= 0,maximumperturbation= 1,initialtemperature= 100,numberof Markovchains= 100,andthepercentageofacceptance= 80. SAobtaineda J = 46.100502, withK =0.383241,andK =[−0.108121, 0.189388, 1.424966]. Figure8showstheresponse 1 2 ofthesystemandFigures9and10showthestatesandinput,respectively. www.intechopen.com 196 Simulated Annealing, Theory with Applications x(k) LQ x(k) LQ u(k) LQ u(k) LQ 0.7 x(k0). 7LQ x(k) LQ 0.35 0u.3(k5) LQ u(k) LQ 0.7 0.7 0.35 0.35 0.6 0.6 0.3 0.3 0.6 0.6 0.3 0.3 0.5 0.5 0.25 0.25 0.5 0.5 0.25 0.25 0.4 0.4 0.2 0.2 0.4 0.4 0.2 0.2 0.3 0.3 0.15 0.15 0.3 0.3 0.15 0.15 0.2 0.2 0.1 0.1 0.2 0.2 0.1 0.1 0.1 0.1 0.05 0.05 0.1 0.1 0.05 0.05 0 0 0 0 00 5 1000 155 2010 15 00 20 5 0100 155 1200 15 20 0 5 100 155 2010 15 0 20 5 100 155 1200 15 20 (a) LQ x(k) SA x(k) SA u(k) SA u(k) SA 0.7 x(k0) .S7A x(k) SA 0.35 0u.(3k5) SA u(k) SA 0.7 0.7 0.35 0.35 0.6 0.6 0.3 0.3 0.6 0.6 0.3 0.3 0.5 0.5 0.25 0.25 0.5 0.5 0.25 0.25 0.4 0.4 0.2 0.2 0.4 0.4 0.2 0.2 0.3 0.3 0.15 0.15 0.3 0.3 0.15 0.15 0.2 0.2 0.1 0.1 0.2 0.2 0.1 0.1 0.1 0.1 0.05 0.05 0.1 0.1 0.05 0.05 0 0 0 0 00 5 1000 155 2010 15 00 20 5 0100 155 1200 15 20 0 5 (b)1SA00 155 2010 15 0 20 5 100 155 1200 15 20 Fig.6.Statesbehavior Unit step response 1.4 4.2 MIMOsystems 4.2.1 Regulator 1.2 Thesystemusedwas: 3.5 0.5 0.5 1 0 0 G= 1 2.5 0 , H= 1 0 0 , Q= 0 1 0 , 1 (cid:27) 0 1 0 (cid:26) 1.5 −1 4 0 0 1     0.8 5 1 0 R= , x(0)= −1  (cid:27) 0 1 (cid:26) 0.6 3   TheSAalgorithmparameterswere: initialsolution= 0,maximumperturbation= 5,initial 0.4 temperature = 100, number of Markov chains = 100, and percentage of acceptance = 80. LQ obtained a J = 732.375702 and SA a J = 733.428460. Gain matrices are very similar. Figure11showsthestates. 0.2 0 0 10 20 30 40 50 60 70 80 90 100 www.intechopen.com Using the simulated annealing algorithm to solve the optimal control problem 197 x(k) LQ x(k) LQ u(k) LQ u(k) LQ 0.7 x(k0). 7LQ x(k) LQ 0.35 0u.3(k5) LQ u(k) LQ 0.7 0.7 0.35 0.35 0.6 0.6 0.3 0.3 0.6 0.6 0.3 0.3 0.5 0.5 0.25 0.25 0.5 0.5 0.25 0.25 0.4 0.4 0.2 0.2 0.4 0.4 0.2 0.2 0.3 0.3 0.15 0.15 0.3 0.3 0.15 0.15 0.2 0.2 0.1 0.1 0.2 0.2 0.1 0.1 0.1 0.1 0.05 0.05 0.1 0.1 0.05 0.05 0 0 0 0 00 5 1000 155 2010 15 00 20 5 0100 155 1200 15 20 0 5 100 155 2010 15 0 20 5 100 155 1200 15 20 (a) LQ x(k) SA x(k) SA u(k) SA u(k) SA 0.7 x(k0) .S7A x(k) SA 0.35 0u.(3k5) SA u(k) SA 0.7 0.7 0.35 0.35 0.6 0.6 0.3 0.3 0.6 0.6 0.3 0.3 0.5 0.5 0.25 0.25 0.5 0.5 0.25 0.25 0.4 0.4 0.2 0.2 0.4 0.4 0.2 0.2 0.3 0.3 0.15 0.15 0.3 0.3 0.15 0.15 0.2 0.2 0.1 0.1 0.2 0.2 0.1 0.1 0.1 0.1 0.05 0.05 0.1 0.1 0.05 0.05 0 0 0 0 00 5 1000 155 2010 15 00 20 5 0100 155 1200 15 20 0 5 100 155 2010 15 0 20 5 100 155 (b) S1A200 15 20 Fig.7.Inputbehavior Unit step response 1.4 1.2 1 0.8 0.6 0.4 0.2 0 0 10 20 30 40 50 60 70 80 90 100 Fig.8.TrackingwithNon-conventionalindex:unitstepresponse. www.intechopen.com 198 Simulated Annealing, Theory with Applications States using LQ State vector x(k) State vector x(k) 5 0.8 States using LQ 0.8 5 0.6 0.6 0 0 0.4 0.4 (cid:1)5 0.2 (cid:1)5 0.2 (cid:1)10 0 0 2 4 6 8 10 12 14 16 18 20 00 10 20 30 40 50 60 70 80 90 100 (cid:1)10 0 10 20 30 40 50 60 70 80 90 100 0 2 4 6 8 10 12 14 16 18 20 Fig.9.TrackingwithNon-conventionalindex:statesbehavior. States using SA Input vector u(k) Input vector u(k) 5 0.4 States using SA 0.4 5 0.3 0.3 0 0 0.2 0.2 (cid:1)5 0.1 (cid:1)5 0.1 0 (cid:1)10 00 10 20 30 40 50 60 70 80 90 100 0 2 4 6 8 10 12 14 16 18 20 (cid:1)10 Fig.10.Trac0kingw10ithNo2n0-conv3e0ntiona4l0index5:0input6b0ehavi7o0r. 80 90 100 0 2 4 6 8 10 12 14 16 18 20 4.2.2 Tracking Thenumberofsampleswas100. −1 0 0 2 3 4 −1 G= 03 1 0 , H= 1 0 , CT = 1 0 , 2 0 0 1 0 1 0 1  2      1 0 0 1 0 Q= 0 1 0 , R= (cid:27) 0 1 (cid:26) 0 0 1   www.intechopen.com

Description:
This work presents an approach to solve the Discrete-Time Time Invariant A discrete optimal control system can be represented as follows (Ogata,
See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.