
Finite Approximations in Discrete-Time Stochastic Control PDF

196 Pages·2018·1.97 MB·English

Preview Finite Approximations in Discrete-Time Stochastic Control

Systems & Control: Foundations & Applications

Naci Saldi · Tamás Linder · Serdar Yüksel

Finite Approximations in Discrete-Time Stochastic Control
Quantized Models and Asymptotic Optimality

Series Editor
Tamer Başar, University of Illinois at Urbana-Champaign, Urbana, IL, USA

Editorial Board
Karl Johan Åström, Lund University of Technology, Lund, Sweden
Han-Fu Chen, Academia Sinica, Beijing, China
Bill Helton, University of California, San Diego, CA, USA
Alberto Isidori, Sapienza University of Rome, Rome, Italy
Miroslav Krstic, University of California, San Diego, CA, USA
H. Vincent Poor, Princeton University, Princeton, NJ, USA
Mete Soner, ETH Zürich, Zürich, Switzerland; Swiss Finance Institute, Zürich, Switzerland

Former Editorial Board Member
Roberto Tempo (1956–2017†), CNR-IEIIT, Politecnico di Torino, Italy

More information about this series at http://www.springer.com/series/4895

Naci Saldi
Department of Natural and Mathematical Sciences
Ozyegin University
Istanbul, Turkey

Tamás Linder
Department of Mathematics and Statistics
Queen’s University
Kingston, Ontario, Canada

Serdar Yüksel
Department of Mathematics & Statistics
Queen’s University
Kingston, Ontario, Canada

ISSN 2324-9749
ISSN 2324-9757 (electronic)
Systems & Control: Foundations & Applications
ISBN 978-3-319-79032-9
ISBN 978-3-319-79033-6 (eBook)
https://doi.org/10.1007/978-3-319-79033-6
Library of Congress Control Number: 2018939290
Mathematics Subject Classification (2010): 93E20, 90C40, 60J20, 49J55, 90B99, 60J05

© Springer International Publishing AG, part of Springer Nature 2018

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Printed on acid-free paper

This book is published under the imprint Birkhäuser, www.birkhauser-science.com by the registered company Springer International Publishing AG part of Springer Nature.
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Contents

1 Introduction and Summary
  1.1 Introduction and Motivation
  1.2 The Quantization Approach Presented in This Book
    1.2.1 Markov Decision Processes and a Brief Literature Review of Approximation Methods
    1.2.2 Decentralized Stochastic Control
    1.2.3 What This Book Presents: A Unified Quantization Approach Under Relaxed Regularity Conditions
  1.3 Organization of the Book
  1.4 Notation and Conventions

Part I Finite Model Approximations in Stochastic Control

2 Prelude to Part I
  2.1 Markov Decision Processes
  2.2 Partially Observed Markov Decision Processes
  2.3 Constrained Markov Decision Processes
3 Finite-Action Approximation of Markov Decision Processes
  3.1 Introduction
  3.2 Quantizing the Action Space
    3.2.1 Finite Action Model
  3.3 Near Optimality of Quantized Policies Under Strong Continuity
    3.3.1 Discounted Cost
    3.3.2 Average Cost
  3.4 Near Optimality of Quantized Policies Under Weak Continuity
    3.4.1 Discounted Cost
    3.4.2 Average Cost
  3.5 Rates of Convergence
    3.5.1 Discounted Cost
    3.5.2 Average Cost
    3.5.3 Order Optimality
  3.6 Proofs
  3.7 Concluding Remarks
4 Finite-State Approximation of Markov Decision Processes
  4.1 Introduction
    4.1.1 Auxiliary Results
  4.2 Finite-State Approximation of Compact State MDPs
    4.2.1 Discounted Cost
    4.2.2 Average Cost
  4.3 Finite-State Approximation of Non-compact State MDPs
    4.3.1 Discounted Cost
    4.3.2 Average Cost
  4.4 Discretization of the Action Space
  4.5 Rates of Convergence for Compact-State MDPs
    4.5.1 Discounted Cost
    4.5.2 Average Cost
    4.5.3 Order Optimality
  4.6 Numerical Examples
    4.6.1 Additive Noise System
    4.6.2 Fisheries Management Problem
  4.7 Proofs
  4.8 Concluding Remarks
5 Approximations for Partially Observed Markov Decision Processes
  5.1 Introduction
  5.2 Continuity Properties of the Belief-MDP
    5.2.1 Strong and Weak Continuity Properties
    5.2.2 Finite-Action Model
    5.2.3 Finite-State Model
  5.3 Quantization of the Belief Space
    5.3.1 Construction with Finite X
    5.3.2 Construction with Compact X
    5.3.3 Construction with Non-compact X
    5.3.4 Construction for Special Models Leading to Quantized Beliefs with Continuous Support
  5.4 Numerical Examples
    5.4.1 Example with Finite X
    5.4.2 Example with Compact X
  5.5 Concluding Remarks
6 Approximations for Constrained Markov Decision Problems
  6.1 Introduction
  6.2 Constrained Markov Decision Processes
    6.2.1 Finite-State Model
  6.3 Approximation of Discounted Cost Problems
    6.3.1 Approximation of Value Function
    6.3.2 Approximation of Optimal Policy
  6.4 Approximation of Average Cost Problems
  6.5 Concluding Remarks

Part II Finite Model Approximations in Decentralized Stochastic Control

7 Prelude to Part II
  7.1 Witsenhausen’s Intrinsic Model
  7.2 Static Reduction of Sequential Dynamic Teams
8 Finite Model Approximations in Decentralized Stochastic Control
  8.1 Introduction
    8.1.1 Auxiliary Results
  8.2 Approximation of Static Team Problems
    8.2.1 Approximation of Static Teams with Compact Observation Spaces and Bounded Cost
    8.2.2 Approximation of Static Teams with Non-compact Observation Spaces and Unbounded Cost
  8.3 Approximation of Dynamic Team Problems
    8.3.1 Approximation of Dynamic Teams Admitting a Static Reduction
  8.4 Discretization of the Action Spaces
  8.5 On Rates of Convergence in Quantized Approximations
  8.6 Concluding Remarks
9 Asymptotic Optimality of Finite Models for Witsenhausen’s Counterexample and Beyond
  9.1 Introduction
  9.2 Witsenhausen’s Counterexample
    9.2.1 Witsenhausen’s Counterexample and Its Static Reduction
    9.2.2 Approximation of Witsenhausen’s Counterexample
    9.2.3 Asymptotic Optimality of Finite Models for Witsenhausen’s Counterexample
  9.3 The Gaussian Relay Channel Problem
    9.3.1 The Gaussian Relay Channel Problem and Its Static Reduction
    9.3.2 Approximation of the Gaussian Relay Channel Problem
    9.3.3 Asymptotic Optimality of Finite Models for Gaussian Relay Channel Problem
  9.4 Concluding Remarks
References
Index

Chapter 1
Introduction and Summary

1.1 Introduction and Motivation

Control and optimization of dynamical systems in the presence of stochastic uncertainty is a mature field with a large range of applications. A comprehensive treatment of such problems can be found in excellent books and other resources including [7, 16, 29, 68, 84, 95, 104], and [6]. To date, there exists a nearly complete theory regarding the existence and structure of optimal solutions under various formulations as well as computational methods to obtain such optimal solutions for problems with finite state and control spaces. However, there still exist substantial computational challenges involving problems with large state and action spaces, such as standard Borel spaces. For such state and action spaces, obtaining optimal policies is in general computationally infeasible.

An increasingly important avenue of research is what is often referred to as decentralized stochastic control or dynamic team theory, which involves multiple decision makers who strive for a common goal but who have access only to local information. Information structures in a decentralized control problem specify which decision maker has access to what information and the impact of a decision maker’s action on other decision makers’ information. We will present a concise overview of information structures in the book, but refer the reader to more extensive resources such as [148]. Decentralized control problems may be quite challenging under certain information structures; in general few results are known regarding systematic methods to arrive at optimal solutions, and there exist problems (such as Witsenhausen’s counterexample) which have defied solution attempts for more than 40 years.

Quantization, as we will demonstrate, provides a systematic constructive approach for obtaining approximately optimal solutions with guaranteed convergence properties for both classical stochastic control problems and decentralized stochastic control problems. The aim of this book is to present our recent results regarding the approximation problem in a unified form. Our approach is to establish the convergence of approximate models obtained through quantization with a rigorous treatment involving stochastic control, probability theory, and information theory.

Additional motivation for our approach comes from networked control applications.
In networked control, the transmission of real-valued control actions to an actuator is not realistic when there is an information transmission constraint (physically limited by the presence of a communication channel) between a plant, a controller, or an actuator; in this case the actions of a controller must be quantized in order to be reliably transmitted to an actuator. Hence, there is a practical need to approximate control policies by policies which are computable and transmittable.

1.2 The Quantization Approach Presented in This Book

1.2.1 Markov Decision Processes and a Brief Literature Review of Approximation Methods

A discrete-time Markov decision process (MDP) is a mathematical model for sequential decision making under stochastic uncertainty that has proved useful in modeling a wide range of systems in engineering, economics, and biology (see [44, 68]). An MDP can be specified by the following components: (i) the state space $\mathsf{X}$ and the action space $\mathsf{A}$, (ii) the transition probability $p(\,\cdot \mid x, a)$ on $\mathsf{X}$ given $\mathsf{X} \times \mathsf{A}$, which gives the distribution of the next state given that the current state-action pair is $(x, a)$, (iii) one-stage cost functions $c_t : \mathsf{X} \times \mathsf{A} \to \mathbb{R}$, $t = 0, 1, 2, \ldots$ (in general $c_t = c$ for some $c : \mathsf{X} \times \mathsf{A} \to \mathbb{R}$), and (iv) the initial distribution $\mu$ on $\mathsf{X}$.

If $X_t$ and $A_t$ denote the state and action variables at time step $t$, then with these definitions, we have

\[
\Pr\bigl(X_0 \in \cdot\,\bigr) = \mu(\,\cdot\,),
\]
\[
\Pr\bigl(X_{t+1} \in \cdot \bigm| X_{\{0,t\}}, A_{\{0,t\}}\bigr) = \Pr\bigl(X_{t+1} \in \cdot \bigm| X_t, A_t\bigr) = p(\,\cdot \mid X_t, A_t), \quad t = 1, 2, \ldots,
\]

where $X_{\{0,t\}} = (X_0, \ldots, X_t)$ and $A_{\{0,t\}} = (A_0, \ldots, A_t)$. In this model, at each time step $t$, the decision maker observes the state of the system $X_t$ and chooses an action $A_t$, using a decision function (control policy) $\pi_t$, depending on the observations obtained up to that time, $(X_0, A_0, X_1, A_1, \ldots, X_{t-1}, A_{t-1}, X_t)$.
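
The dynamics just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two-state, two-action model, the numbers in `MU` and `P`, the cost function, and the helper names `simulate` and `mirror_policy` are all hypothetical and do not come from the book. The sketch draws $X_0$ from $\mu$, lets a policy pick $A_t$ from the history, and draws $X_{t+1}$ from $p(\,\cdot \mid X_t, A_t)$.

```python
import random

# Hypothetical toy model (not from the book): two states, two actions.
STATES = [0, 1]
ACTIONS = [0, 1]
MU = [0.5, 0.5]  # initial distribution mu on the state space

# P[x][a] is the distribution p(. | x, a) over the next state.
P = {
    0: {0: [0.9, 0.1], 1: [0.2, 0.8]},
    1: {0: [0.7, 0.3], 1: [0.1, 0.9]},
}


def cost(x, a):
    """One-stage cost c(x, a); an arbitrary illustrative choice."""
    return float(x) + 0.5 * a


def simulate(policy, horizon, rng):
    """Sample X_0 ~ mu, then A_t = policy(history) and X_{t+1} ~ p(. | X_t, A_t)."""
    x = rng.choices(STATES, weights=MU)[0]
    history = [x]  # stored as X_0, A_0, X_1, A_1, ..., X_t
    total = 0.0
    for _ in range(horizon):
        a = policy(history)          # the policy may use the whole history
        total += cost(x, a)          # immediate cost c(X_t, A_t)
        x = rng.choices(STATES, weights=P[x][a])[0]
        history += [a, x]
    return history, total


def mirror_policy(history):
    """A deterministic policy that looks only at the current state X_t."""
    return history[-1]


rng = random.Random(0)
history, total = simulate(mirror_policy, horizon=5, rng=rng)
print(history, total)
```

Passing the full history to the policy mirrors the general definition in the text, even though this particular policy ignores everything except the current state.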
The action can be a selection of a point from the action set, i.e., $\pi_t(X_0, A_0, X_1, A_1, \ldots, X_{t-1}, A_{t-1}, X_t) = A_t$ (deterministic policy), or a selection of a probability measure over the action set, i.e., $\pi_t(X_0, A_0, X_1, A_1, \ldots, X_{t-1}, A_{t-1}, X_t) = \Pr\bigl(A_t \in \cdot \bigm| X_0, A_0, X_1, A_1, \ldots, X_{t-1}, A_{t-1}, X_t\bigr)$ (randomized policy). The effect of choosing an action at time $t$ is twofold: an immediate cost $c_t(X_t, A_t)$ is incurred and the state of the system evolves to a new state probabilistically according to the transition probability $p(\,\cdot \mid X_t, A_t)$.
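
The difference between the two kinds of policy can be made concrete with a small Python sketch. Everything here is an assumption for illustration: the three-element action set, the particular probabilities, and the function names are hypothetical. A deterministic policy returns one action, a randomized policy returns a distribution over actions from which $A_t$ is then drawn, and a deterministic policy can be viewed as a randomized one that puts all its mass on a single action.

```python
import random

ACTIONS = [0, 1, 2]  # hypothetical finite action set


def deterministic_policy(history):
    """Return a single action A_t as a function of the history (here only X_t)."""
    x_t = history[-1]
    return ACTIONS[x_t % len(ACTIONS)]


def randomized_policy(history):
    """Return a probability distribution over ACTIONS given the history."""
    x_t = history[-1]
    if x_t == 0:
        return [0.8, 0.1, 0.1]
    return [0.1, 0.1, 0.8]


def sample_action(dist, rng):
    """Draw A_t from the distribution returned by a randomized policy."""
    return rng.choices(ACTIONS, weights=dist)[0]


def as_randomized(det_policy):
    """View a deterministic policy as a point mass on the action it selects."""
    def policy(history):
        chosen = det_policy(history)
        return [1.0 if a == chosen else 0.0 for a in ACTIONS]
    return policy


history = [0, 1, 1]  # X_0, A_0, X_1
rng = random.Random(1)
det_action = deterministic_policy(history)
dist = randomized_policy(history)
sampled = sample_action(as_randomized(deterministic_policy)(history), rng)
print(det_action, dist, sampled)
```

Representing a randomized policy by the distribution it returns, rather than by a sampled action, keeps the policy itself free of randomness and leaves the sampling step to the simulator.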

Description:
In a unified form, this monograph presents fundamental results on the approximation of centralized and decentralized stochastic control problems, with uncountable state, measurement, and action spaces. It demonstrates how quantization provides a system-independent and constructive method for the red