
REGULARIZED DECOMPOSITION METHODS FOR DETERMINISTIC AND STOCHASTIC CONVEX OPTIMIZATION AND APPLICATION TO PORTFOLIO SELECTION WITH DIRECT TRANSACTION AND MARKET IMPACT COSTS

VINCENT GUIGUES AND MIGUEL LEJEUNE AND WAJDI TEKAYA

arXiv:1701.03941v1 [math.OC] 14 Jan 2017

Abstract. We define a regularized variant of the Dual Dynamic Programming algorithm called REDDP (REgularized Dual Dynamic Programming) to solve nonlinear dynamic programming equations. We extend the algorithm to solve nonlinear stochastic dynamic programming equations. The corresponding algorithm, called SDDP-REG, can be seen as an extension of a recently introduced regularization of the Stochastic Dual Dynamic Programming (SDDP) algorithm, which was studied for linear problems only and with less general prox-centers. We show the convergence of REDDP and SDDP-REG. We assess the performance of REDDP and SDDP-REG on portfolio models with direct transaction and market impact costs. In particular, we propose a risk-neutral portfolio selection model which can be cast as a multistage stochastic second-order cone program. The formulation is motivated by the effect of market impact costs on large portfolio rebalancing operations. Numerical simulations show that REDDP is much quicker than DDP on all problem instances considered (up to 184 times quicker than DDP) and that SDDP-REG is quicker on the instances of portfolio selection problems with market impact costs tested and much faster on the instance of risk-neutral multistage stochastic linear program implemented (8.2 times faster).

Keywords: Stochastic Optimization, Stochastic Dual Dynamic Programming, Regularization, Portfolio Selection, Market Impact Costs.

AMS subject classifications: 90C15, 90C90.

1. Introduction

Multistage stochastic optimization problems are used to model many real-life applications where a sequence of decisions has to be made, subject to random costs and constraints arising from the observations of a stochastic process. Solving such problems is challenging and often requires some assumptions on the underlying stochastic process, on the problem structure, and some sort of decomposition. In this paper, we are interested in problems for which deterministic or stochastic dynamic programming equations can be written. In the stochastic case, we will focus on situations where the underlying stochastic process is discrete and interstage independent, the number of stages is moderate to large, and the state vector is of small size.

Two popular solution methods to solve stochastic dynamic programming equations are Approximate Dynamic Programming (ADP) [37] and Stochastic Dual Dynamic Programming (SDDP) [32], which is a sampling-based variant of the Nested Decomposition (ND) algorithm [6, 7]. Several enhancements of SDDP have been proposed such as the extension to interstage dependent stochastic processes [24, 19], different sampling schemes [8], and recently the introduction and analysis of risk-averse variants [22, 23, 25, 34, 40, 41], cut selection strategies [33, 35], and convergence proofs of the algorithm for linear problems in [36], for nonlinear risk-neutral problems in [16], for nonlinear risk-averse problems in [20], and for linear problems without relatively complete recourse in [20]. However, a known drawback of the method is its convergence rate, making it difficult to apply to problems with moderate or large state vectors. To cope with this difficulty, a regularized variant of SDDP was recently proposed in [4] for Multistage Stochastic Linear Programs (MSLPs).
In this variant, the trial points are computed in the forward pass of SDDP by penalizing the objective with a quadratic term that depends on a prox-center (called incumbent in [4]), shared between nodes of the same stage and updated at each iteration. On the tests reported in [4], the regularized method converges faster than the classical SDDP method on risk-neutral instances of MSLPs. On the basis of these encouraging numerical results, several natural extensions of this regularized variant can be considered:

a) When specialized to deterministic problems, how does the regularized method behave? For such problems, how can the method be extended when nonlinear objective and constraints are present, and under which assumptions? Can we show the convergence of the method applied to these problems under these assumptions?

b) How can the regularized algorithm be extended to solve Multistage Stochastic NonLinear Problems (MSNLPs) and under which assumptions? Can we show the convergence of this algorithm applied to MSNLPs under these assumptions?

c) What other prox-centers and penalization schemes can be proposed? Find an MSLP for testing the new prox-centers and penalization schemes. Can we observe on this application a faster convergence of the regularized method, as for the application considered in [4]?

d) Find a relevant application, modeled by an MSNLP, to test the regularized variant of SDDP.

The objective of this paper is to study items a)-d) above. Our findings on these topics are as follows:

a) REDDP: REgularized Dual Dynamic Programming. We propose a regularized variant of Dual Dynamic Programming (DDP, the deterministic counterpart of SDDP) called REDDP, for nonlinear optimization problems. For REDDP, in Theorem 2.4, we show the convergence of the sequence of approximate first step optimal values to the optimal value of the problem and that any accumulation point of the sequence of trial points is an optimal solution of the problem.
The same proof, with weaker assumptions (see Remark 2.5), can be used to show the convergence of this regularized variant of DDP applied to linear problems. We then consider instances of a portfolio problem with direct transaction costs with a large number of stages and compare the computational time required to solve these instances with DDP and REDDP. In all experiments, the computational time was drastically reduced using REDDP. More precisely, we tested 6 different implementations of REDDP and, for problems with T = 10, 50, 100, 150, 200, 250, 300, and 350 time periods, the range (over these 6 implementations) of the reduction factor of the overall computational time with REDDP was respectively [3.0, 3.0], [13.8, 17.3], [22.3, 33.5], [37.1, 65.0], [46.6, 76.7], [80.0, 114.3], [71.5, 171.6], and [95.5, 184.4]. Since DDP (possibly with cut selection as in [21]) can already outperform direct solution methods (such as interior point methods or simplex) on some instances of large-scale linear problems (see the numerical experiments in [21]), REDDP could be a competitive solution method to solve some large-scale problems, in particular linear ones, for which dynamic programming equations with convex value functions and a large number of time periods can be written.

b) SREDA: A Stochastic REgularized Decomposition Algorithm to solve MSNLPs. We define a Stochastic REgularized Decomposition Algorithm (SREDA) for MSNLPs which samples in the backward pass to compute cuts at trial points computed, as in [4], in a forward pass, penalizing the objective with a quadratic term depending on a prox-center shared between nodes of the same stage. In Theorem 4.2, we show the convergence of this algorithm and observe in Remark 4.3 that the proof allows us to obtain the convergence of a regularized variant of SDDP called SDDP-REG applied to the nonlinear problems we are interested in.
More precisely, we show (i) the convergence of the sequence of the optimal values of the approximate first stage problems and that (ii) any accumulation point of the sequence of decisions can be used to define an optimal solution of the problem. It will turn out that (ii) improves already known results for SDDP.

c) On prox-centers, penalization parameters, and on the performance of the regularization for MSLPs. We propose new prox-centers and penalization schemes and test them on risk-neutral and risk-averse instances of portfolio selection problems.

d) Portfolio Selection with Direct Transaction and Market Impact Costs. The multistage optimization models studied in this paper are directly applicable in finance and in particular for the rebalancing of portfolios that incur transaction costs. Transaction costs can have a major impact on the performance of an investment strategy (see, e.g., the survey [11]). Two main types of transaction costs, implicit and explicit, can be distinguished. Explicit or direct transaction costs are directly observable (e.g., broker, custodial fees), are directly charged to the investor, and are generally modelled as linear [5, 12] or piecewise linear [9]. In reality, however, it is not possible to trade arbitrarily large quantities of securities at their current theoretical market price. Implicit or indirect costs, often called market impact costs, result from imperfect markets due for example to market or liquidity restrictions (e.g., bid-ask spreads), depend on the order-book situation when the order is executed, and are not itemized explicitly, thereby making it difficult for investors to recognize them. Yet, for large orders, they are typically much larger than the direct transaction costs. Market impact costs are equal to the difference between the transaction price and the (unperturbed) market price that would have prevailed if the trade had not occurred [42, 43, 45].
Market impact costs are typically nonlinear (see, e.g., [2, 3, 15, 17, 42]), and much more challenging to model than direct transaction costs. Market impact costs are particularly important for large institutional investors, for which they can represent a major proportion of the total transaction costs [28, 42]. They can be viewed as an additional price for the immediate execution of large trades.

There is a widespread interest in the modeling and analysis of market impact costs as they are (one of) the main reducible parts of the transaction costs [28]. In this study, we propose a series of dynamic optimization models (deterministic and stochastic, risk-neutral and risk-averse) for portfolio optimization with direct transaction and market impact costs.

We compare the computational time required by SDDP-REG and SDDP to solve instances of risk-neutral and risk-averse portfolio problems with direct transaction costs. We also compare the computational time required by SDDP-REG and SDDP to solve risk-neutral instances of portfolio problems with market impact costs using real data and T = 48 stages. To our knowledge, no dynamic optimization problem for portfolio optimization with conic market impact costs has been proposed so far. Also, we are not aware of other published numerical tests on the application of SDDP to a real-life application modelled by a multistage stochastic second-order cone program with a large number of stages (48 in our case, which already corresponds to a very challenging multistage stochastic nonlinear problem).

The paper is organized as follows. In Section 2, we present a class of convex deterministic nonlinear optimization problems for which dynamic programming equations can be written. We propose the variant REDDP of DDP to solve these problems and show the convergence of this method in Theorem 2.4.
Though this theorem is a special case of Theorem 4.2 given in Section 4, we thought it would be convenient for the reader to have this simpler proof in mind when considering the more complicated stochastic case since most arguments of the proof of Theorem 2.4 are re-used for the proof of Theorem 4.2. In Section 3, we introduce the type of stochastic nonlinear problems we are interested in and propose SREDA, a regularized decomposition algorithm to solve these problems. In Section 4, we show in Theorem 4.2 the convergence of SREDA. The portfolio selection models described in item d) above are discussed in Section 5. Finally, the last Section 6 presents the results of numerical simulations that illustrate our results. We show that REDDP is much quicker than DDP on all problem instances considered (up to 184 times quicker than DDP) and that SDDP-REG is quicker on the instances of nonlinear stochastic programs tested and much faster on the instance of risk-neutral multistage stochastic linear program implemented (8.2 times faster).

We use the following notation and terminology:
- The usual scalar product in $\mathbb{R}^n$ is denoted by $\langle x, y\rangle = x^T y$ for $x, y \in \mathbb{R}^n$. The corresponding norm is $\|x\| = \|x\|_2 = \sqrt{\langle x, x\rangle}$.
- $\mathrm{ri}(A)$ is the relative interior of set $A$.
- $\mathbb{B}_n = \{x \in \mathbb{R}^n : \|x\| \le 1\}$.
- $\mathrm{dom}(f)$ is the domain of function $f$.
- $N_A(x)$ is the normal cone to $A$ at $x$.
- $AV@R_\alpha$ is the Average Value-at-Risk with confidence level $\alpha$, [38].
- $D(X)$ is the diameter of set $X$.

2. Regularized dual dynamic programming: Algorithm and convergence

2.1. Problem formulation and assumptions. Consider the problem

\[
\begin{cases}
\min \ \sum_{t=1}^{T} f_t(x_{t-1}, x_t) \\
x_t \in X_t(x_{t-1}), \ \forall t = 1,\ldots,T,
\end{cases}
\tag{1}
\]

where $X_t(x_{t-1}) \subset X_t \subset \mathbb{R}^n$ is given by

\[
X_t(x_{t-1}) = \{x_t \in X_t : A_t x_t + B_t x_{t-1} = b_t, \ g_t(x_{t-1}, x_t) \le 0\},
\]

$f_t : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$ is a convex function, $g_t : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^p$, and $x_0$ is given.
For this problem, we can write dynamic programming equations defining recursively the functions $Q_t : X_{t-1} \to \mathbb{R}$ as

\[
Q_t(x_{t-1}) := \min\{ f_t(x_{t-1}, x_t) + Q_{t+1}(x_t) : x_t \in X_t(x_{t-1}) \}, \quad t = T, T-1, \ldots, 1,
\tag{2}
\]

with the convention that $Q_{T+1} \equiv 0$. Clearly, $Q_1(x_0)$ is the optimal value of (1). More generally, we have

\[
Q_t(x_{t-1}) = \min\Big\{ \sum_{j=t}^{T} f_j(x_{j-1}, x_j) : x_j \in X_j(x_{j-1}), \ \forall j = t,\ldots,T \Big\}.
\]

We make the following assumptions: setting

\[
X_t^{\varepsilon} := X_t + \varepsilon \mathbb{B}_n,
\tag{3}
\]

(H0) For $t = 1,\ldots,T$,
(a) $X_t \subset \mathbb{R}^n$ is nonempty, convex, and compact.
(b) $f_t$ is proper, convex, and lower semicontinuous.
(c) Setting $g_t(x_{t-1}, x_t) = (g_{t,1}(x_{t-1}, x_t), \ldots, g_{t,p}(x_{t-1}, x_t))$, for $i = 1,\ldots,p$, the $i$-th component function $g_{t,i}(x_{t-1}, x_t)$ is a convex lower semicontinuous function.
(d) There exists $\varepsilon > 0$ such that $X_{t-1}^{\varepsilon} \times X_t \subset \mathrm{dom}(f_t)$ and for every $x_{t-1} \in X_{t-1}^{\varepsilon}$, there exists $x_t \in X_t$ such that $g_t(x_{t-1}, x_t) \le 0$ and $A_t x_t + B_t x_{t-1} = b_t$.
(e) If $t \ge 2$, there exists $\bar{x}_t = (\bar{x}_{t,t-1}, \bar{x}_{t,t}) \in X_{t-1} \times \mathrm{ri}(X_t) \cap \mathrm{ri}(\{g_t \le 0\})$ such that $\bar{x}_{t,t} \in X_t$, $g_t(\bar{x}_{t,t-1}, \bar{x}_{t,t}) \le 0$ and $A_t \bar{x}_{t,t} + B_t \bar{x}_{t,t-1} = b_t$.

The DDP algorithm solves (1) exploiting the convexity of recourse functions $Q_t$:

Lemma 2.1. Consider recourse functions $Q_t$, $t = 1,\ldots,T+1$, given by (2). Let Assumptions (H0)-(a), (H0)-(b), (H0)-(c), and (H0)-(d) hold. Then for $t = 1,\ldots,T+1$, $Q_t$ is convex, finite on $X_{t-1}^{\varepsilon}$, and Lipschitz continuous on $X_{t-1}$.

Proof: We give the idea of the proof. For more details, we refer to the proof of Proposition 3.1 in [20] where similar value functions are considered. The proof is by backward induction on $t$, starting with $t = T+1$ where the statement holds by definition of $Q_{T+1}$. Assuming for $t \in \{1,\ldots,T\}$ that $Q_{t+1}$ is convex, finite on $X_t^{\varepsilon}$, and Lipschitz continuous on $X_t$, then Assumptions (H0)-(a), (b), (c) imply the convexity of $Q_t$, and Assumptions (H0)-(a), (b), (d) that $Q_t$ is finite on $X_{t-1}^{\varepsilon}$ and therefore Lipschitz continuous on $X_{t-1}$.
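As an illustration of recursion (2), the following minimal Python sketch tabulates the recourse functions on a grid for a toy instance and compares the value obtained for $Q_1(x_0)$ with the optimal value of (1) computed by enumerating all trajectories. Everything specific to the sketch is an assumption made for illustration and does not come from the paper: the scalar state, the discretization of $X_t = [0,1]$ by a grid, the quadratic cost `stage_cost`, the absence of the constraints $A_t x_t + B_t x_{t-1} = b_t$ and $g_t \le 0$, and the helper names `backward_recursion` and `brute_force`.

```python
import itertools


def stage_cost(t, x_prev, x, targets):
    """Hypothetical convex cost f_t(x_{t-1}, x_t): track a target, penalize moves."""
    return (x - targets[t - 1]) ** 2 + 0.5 * (x - x_prev) ** 2


def backward_recursion(grid, targets):
    """Tabulate Q_t on the grid for t = T+1, T, ..., 1 following recursion (2)."""
    T = len(targets)
    Q = {T + 1: {x: 0.0 for x in grid}}  # convention Q_{T+1} = 0
    for t in range(T, 0, -1):
        Q[t] = {
            x_prev: min(stage_cost(t, x_prev, x, targets) + Q[t + 1][x] for x in grid)
            for x_prev in grid
        }
    return Q


def brute_force(grid, targets, x0):
    """Optimal value of (1) restricted to the grid, by enumerating all trajectories."""
    best = float("inf")
    for path in itertools.product(grid, repeat=len(targets)):
        total, prev = 0.0, x0
        for t, x in enumerate(path, start=1):
            total += stage_cost(t, prev, x, targets)
            prev = x
        best = min(best, total)
    return best


grid = [i / 10 for i in range(11)]  # discretization of X_t = [0, 1] (assumption)
targets = [0.2, 0.9, 0.4]           # T = 3 stages (assumption)
x0 = 0.0                            # initial state, chosen on the grid

Q = backward_recursion(grid, targets)
q1_dp = Q[1][x0]
q1_enum = brute_force(grid, targets, x0)
print(q1_dp, q1_enum)
```

On the grid, both computations minimize the same finite sum over the same finite set of trajectories, so the two printed values should agree up to rounding. With N grid points, the recursion needs on the order of T·N² cost evaluations, whereas the enumeration visits N^T trajectories.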
The description of the subdifferential of $Q_t$ given in the following proposition will be useful for DDP, REDDP, and SREDA:

Proposition 2.2 (Lemma 2.1 in [20]). Let Assumptions (H0) hold. Let $x_t(x_{t-1})$ be an optimal solution of (2). Then for every $t = 2,\ldots,T$, for every $x_{t-1} \in X_{t-1}$, $s \in \partial Q_t(x_{t-1})$ if and only if

\[
\begin{aligned}
(s, 0) \in{} & \partial f_t(x_{t-1}, x_t(x_{t-1})) + \big\{ [A_t^T; B_t^T]\nu : \nu \in \mathbb{R}^q \big\} \\
& + \Big\{ \sum_{i \in I(x_{t-1}, x_t(x_{t-1}))} \mu_i \, \partial g_{t,i}(x_{t-1}, x_t(x_{t-1})) : \mu_i \ge 0 \Big\} + \{0\} \times N_{X_t}(x_t(x_{t-1})),
\end{aligned}
\]

where $I(x_{t-1}, x_t(x_{t-1})) = \{ i \in \{1,\ldots,p\} : g_{t,i}(x_{t-1}, x_t(x_{t-1})) = 0 \}$.

Proof: See [20].

2.2. Dual Dynamic Programming. We first recall the DDP method to solve (2). It uses relatively easy approximations $Q_t^k$ of $Q_t$. At iteration $k$, let functions $Q_t^k : X_{t-1} \to \mathbb{R}$ such that

\[
Q_{T+1}^k = Q_{T+1}, \quad Q_t^k \le Q_t, \quad t = 2, 3, \ldots, T,
\tag{4}
\]

be given and define for $t = 1, 2, \ldots, T$ the function $\underline{Q}_t^k : X_{t-1} \to \mathbb{R}$ as

\[
\underline{Q}_t^k(x_{t-1}) = \min\{ f_t(x_{t-1}, x_t) + Q_{t+1}^k(x_t) : x_t \in X_t(x_{t-1}) \} \quad \forall x_{t-1} \in X_{t-1}.
\]

Clearly, (4) implies that

\[
\underline{Q}_T^k = Q_T, \quad \underline{Q}_t^k \le Q_t, \quad t = 1, 2, \ldots, T-1.
\]

It is assumed that the functions $\underline{Q}_t^k$ can be evaluated at any point $x_{t-1} \in X_{t-1}$. The DDP algorithm works as follows:

DDP (Dual Dynamic Programming).

Step 1) Initialization. Let $Q_t^0 : X_{t-1} \to \mathbb{R} \cup \{-\infty\}$, $t = 2,\ldots,T+1$, satisfying (4) be given. Set $k = 1$.

Step 2) Forward pass. Setting $x_0^k = x_0$, for $t = 1, 2, \ldots, T$, compute

\[
x_t^k \in \operatorname{argmin}\{ f_t(x_{t-1}^k, x_t) + Q_{t+1}^{k-1}(x_t) : x_t \in X_t(x_{t-1}^k) \}.
\tag{5}
\]

Step 3) Backward pass. Define $Q_{T+1}^k \equiv 0$. For $t = T, T-1, \ldots, 2$, solve the problem

\[
\underline{Q}_t^k(x_{t-1}^k) = \min\{ f_t(x_{t-1}^k, x_t) + Q_{t+1}^k(x_t) : x_t \in X_t(x_{t-1}^k) \},
\tag{6}
\]

using Proposition 2.2 take a subgradient $\beta_t^k$ of $\underline{Q}_t^k(\cdot)$ at $x_{t-1}^k$, and store the new cut

\[
C_t^k(x_{t-1}) := \underline{Q}_t^k(x_{t-1}^k) + \langle \beta_t^k, x_{t-1} - x_{t-1}^k \rangle
\]

for $Q_t$, making up the new approximation $Q_t^k = \max\{Q_t^{k-1}, C_t^k\}$.
Step 4) Do $k \leftarrow k+1$ and go to Step 2).

2.3. Regularized Dual Dynamic Programming. For the regularized DDP to be presented in this section, we still define

\[
\underline{Q}_t^k(x_{t-1}) = \min\{ F_t^k(x_{t-1}, x_t) : x_t \in X_t(x_{t-1}) \} \quad \forall x_{t-1} \in X_{t-1},
\]

where

\[
F_t^k(x_{t-1}, x_t) = f_t(x_{t-1}, x_t) + Q_{t+1}^k(x_t).
\tag{7}
\]

However, since the function $Q_{t+1}^k$ computed by regularized DDP is different from the function $Q_{t+1}^k$ computed by DDP, the functions $\underline{Q}_t^k$ obtained with respectively regularized DDP and DDP are different. The regularized DDP algorithm is given below:

REgularized DDP (REDDP).

Step 1) Initialization. Let $Q_t^0 : X_{t-1} \to \mathbb{R} \cup \{-\infty\}$, $t = 2,\ldots,T+1$, satisfying (4) be given. Set $k = 1$.

Step 2) Forward pass. Setting $x_0^k = x_0$, for $t = 1, 2, \ldots, T$, compute

\[
x_t^k \in \operatorname{argmin}\{ \bar{F}_t^{k-1}(x_{t-1}^k, x_t, x_t^{P,k}) : x_t \in X_t(x_{t-1}^k) \},
\tag{8}
\]

where the prox-center $x_t^{P,k}$ is any point in $X_t$ and where $\bar{F}_t^{k-1} : X_{t-1} \times X_t \times X_t \to \mathbb{R}$ is given by

\[
\bar{F}_t^{k-1}(x_{t-1}, x_t, x_t^P) = f_t(x_{t-1}, x_t) + Q_{t+1}^{k-1}(x_t) + \lambda_{t,k} \|x_t - x_t^P\|^2
\]

for some exogenous nonnegative penalization $\lambda_{t,k}$ with $\lambda_{t,k} = 0$ if $t = T$ or $k = 1$.

Step 3) Backward pass. Define $Q_{T+1}^k \equiv 0$. For $t = T, T-1, \ldots, 2$, solve the problem

\[
\underline{Q}_t^k(x_{t-1}^k) = \min\{ f_t(x_{t-1}^k, x_t) + Q_{t+1}^k(x_t) : x_t \in X_t(x_{t-1}^k) \},
\tag{9}
\]

using Proposition 2.2 take a subgradient $\beta_t^k$ of $\underline{Q}_t^k(\cdot)$ at $x_{t-1}^k$, and store the new cut

\[
C_t^k(x_{t-1}) := \underline{Q}_t^k(x_{t-1}^k) + \langle \beta_t^k, x_{t-1} - x_{t-1}^k \rangle
\]

for $Q_t$, making up the new approximation $Q_t^k = \max\{Q_t^{k-1}, C_t^k\}$.

Step 4) Do $k \leftarrow k+1$ and go to Step 2).

Observe that the backward passes of the regularized and non-regularized DDP are the same. The algorithms differ in the way the trial points are computed: for regularized DDP a proximal term is added to the objective function of each period to avoid moving too far from the prox-center.

2.4. Convergence analysis. The following lemma will be useful to analyze the convergence of regularized DDP:

Lemma 2.3.
Let Assumptions (H0) hold. Then the functions $Q_t^k$, $t = 2,\ldots,T+1$, $k \ge 1$, generated by REDDP are Lipschitz continuous on $X_{t-1}^{\varepsilon}$, satisfy $Q_t^k \le Q_t$, and $\underline{Q}_t^k(x_{t-1}^k)$ and $\beta_t^k$ are bounded for all $t \ge 2$, $k \ge 1$.

Proof: It suffices to follow the proof of Lemma 3.2 in [20] (see footnote 1). Let us give the main steps of the proof, which is by backward induction on $t$ starting with $t = T+1$ where the statement holds by definition of $Q_{T+1}$. Assuming for $t \in \{1,\ldots,T\}$ that $Q_{t+1}^k$ is Lipschitz continuous on $X_t^{\varepsilon}$ with $Q_{t+1}^k \le Q_{t+1}$, then $\underline{Q}_t^k \le Q_t$. Using Proposition 2.2, whose assumptions are satisfied because (H0)-(e) holds, we get $\underline{Q}_t^k \ge C_t^k$ and therefore $Q_t \ge C_t^k$, $Q_t \ge Q_t^k$. Assumptions (H0)-(a)-(d) and finiteness of $Q_t$ on $X_{t-1}^{\varepsilon}$ imply that $\underline{Q}_t^k(x_{t-1}^k)$ and $\beta_t^k$ are finite and allow us to obtain a uniform upper bound on $\beta_t^k$, i.e., a Lipschitz constant valid for all functions $Q_t^k$, $t = 2,\ldots,T+1$, $k \ge 1$.

Theorem 2.4. Consider the sequences of decisions $x_t^k$ and approximate recourse functions $Q_t^k$ generated by REDDP. Let Assumptions (H0) hold and assume that for $t = 1,\ldots,T-1$, we have $\lim_{k \to +\infty} \lambda_{t,k} = 0$ and $\lambda_{T,k} = 0$ for every $k \ge 1$. Then we have $Q_{T+1}(x_T^k) = Q_{T+1}^k(x_T^k)$,

\[
Q_T(x_{T-1}^k) = Q_T^k(x_{T-1}^k) = \underline{Q}_T^k(x_{T-1}^k),
\tag{10}
\]

and for $t = 2,\ldots,T-1$,

\[
H(t): \quad \lim_{k \to +\infty} Q_t(x_{t-1}^k) - Q_t^k(x_{t-1}^k) = \lim_{k \to +\infty} Q_t(x_{t-1}^k) - \underline{Q}_t^k(x_{t-1}^k) = 0.
\]

Also,
(i) $\lim_{k \to +\infty} \underline{Q}_1^k(x_0) = \lim_{k \to +\infty} \bar{F}_1^{k-1}(x_0, x_1^k, x_1^{P,k}) = Q_1(x_0)$, the optimal value of (1), and
(ii) any accumulation point $(x_1^*, \ldots, x_T^*)$ of the sequence $(x_1^k, \ldots, x_T^k)_k$ is an optimal solution of (1).

Proof: Since $Q_{T+1}^k = Q_{T+1} = 0$, we have $Q_{T+1}(x_T^k) = Q_{T+1}^k(x_T^k)$. Next recall that $Q_T^k(x_{T-1}^k) \le Q_T(x_{T-1}^k)$ and

\[
Q_T^k(x_{T-1}^k) \ge C_T^k(x_{T-1}^k) = \underline{Q}_T^k(x_{T-1}^k) = Q_T(x_{T-1}^k),
\]

which shows (10). We prove $H(t)$, $t = 2,\ldots,T-1$, by backward induction on $t$. We have just shown that $H(T)$ holds. Assume that $H(t+1)$ holds for some $t \in \{2,\ldots,T-1\}$. We want to show that $H(t)$ holds.
To alleviate notation, we define the function $F_t : X_{t-1} \times X_t \to \mathbb{R}$ given by

\[
F_t(x_{t-1}, x_t) = f_t(x_{t-1}, x_t) + Q_{t+1}(x_t).
\tag{11}
\]

We will denote by $\bar{x}_t^k$ an optimal solution of the problem defining $\underline{Q}_t^{k-1}(x_{t-1}^k)$, i.e.,

\[
\underline{Q}_t^{k-1}(x_{t-1}^k) = F_t^{k-1}(x_{t-1}^k, \bar{x}_t^k).
\tag{12}
\]

By definition of $Q_t^k$, we have that $Q_t^k(x_{t-1}) \ge C_t^k(x_{t-1})$, which implies $Q_t^k(x_{t-1}^k) \ge C_t^k(x_{t-1}^k) = \underline{Q}_t^k(x_{t-1}^k)$. We deduce that

\[
\begin{aligned}
0 \le Q_t(x_{t-1}^k) - Q_t^k(x_{t-1}^k) &\le Q_t(x_{t-1}^k) - \underline{Q}_t^k(x_{t-1}^k), \\
&\le Q_t(x_{t-1}^k) - \underline{Q}_t^{k-1}(x_{t-1}^k) \quad \text{by monotonicity of } (\underline{Q}_t^k)_k, \\
&= Q_t(x_{t-1}^k) - F_t^{k-1}(x_{t-1}^k, \bar{x}_t^k) \quad \text{by definition of } \bar{x}_t^k, \\
&= Q_t(x_{t-1}^k) - F_t^{k-1}(x_{t-1}^k, x_t^k) + F_t^{k-1}(x_{t-1}^k, x_t^k) - F_t^{k-1}(x_{t-1}^k, \bar{x}_t^k).
\end{aligned}
\tag{13}
\]

Footnote 1. In [20] a forward, instead of a forward-backward algorithm, is considered. In this setting, finiteness of coefficients $\underline{Q}_t^k(x_{t-1}^k)$ and $\beta_t^k$ is not guaranteed for the first iterations (for instance $\underline{Q}_t^1(x_{t-1}^1)$ are $-\infty$ as long as the lower bounding functions $Q_t^0$, $t = 2,\ldots,T$, are set to $-\infty$) but the proof is similar.

Now observe that

\[
\begin{aligned}
Q_t(x_{t-1}^k) - F_t^{k-1}(x_{t-1}^k, x_t^k) &= Q_t(x_{t-1}^k) - f_t(x_{t-1}^k, x_t^k) - Q_{t+1}^{k-1}(x_t^k) \quad \text{by definition of } F_t^{k-1}, \\
&\overset{(11)}{=} Q_t(x_{t-1}^k) - F_t(x_{t-1}^k, x_t^k) + Q_{t+1}(x_t^k) - Q_{t+1}^{k-1}(x_t^k), \\
&\le Q_{t+1}(x_t^k) - Q_{t+1}^{k-1}(x_t^k),
\end{aligned}
\tag{14}
\]

where the last inequality comes from the fact that $x_t^k \in X_t(x_{t-1}^k)$, i.e., $x_t^k$ is feasible for the optimization problem defining $Q_t(x_{t-1}^k)$ with optimal value $Q_t(x_{t-1}^k)$ and objective function $F_t(x_{t-1}^k, \cdot)$. The induction hypothesis gives

\[
\lim_{k \to +\infty} Q_{t+1}(x_t^k) - Q_{t+1}^k(x_t^k) = 0.
\tag{15}
\]

Since functions $(Q_{t+1}^k(\cdot))_k$ are Lipschitz continuous on $X_t$, $Q_{t+1} \ge Q_{t+1}^k \ge Q_{t+1}^{k-1}$, and $(x_t^k)_k$ is a sequence of the compact set $X_t$, using Lemma A.1 in [16], (15) implies that

\[
\lim_{k \to +\infty} Q_{t+1}(x_t^k) - Q_{t+1}^{k-1}(x_t^k) = 0.
\tag{16}
\]
Next, we have

\[
\begin{aligned}
0 \le F_t^{k-1}(x_{t-1}^k, x_t^k) - F_t^{k-1}(x_{t-1}^k, \bar{x}_t^k) ={} & F_t^{k-1}(x_{t-1}^k, x_t^k) - \bar{F}_t^{k-1}(x_{t-1}^k, x_t^k, x_t^{P,k}) \\
& + \bar{F}_t^{k-1}(x_{t-1}^k, x_t^k, x_t^{P,k}) - \bar{F}_t^{k-1}(x_{t-1}^k, \bar{x}_t^k, x_t^{P,k}) \\
& + \bar{F}_t^{k-1}(x_{t-1}^k, \bar{x}_t^k, x_t^{P,k}) - F_t^{k-1}(x_{t-1}^k, \bar{x}_t^k) \\
\le{} & F_t^{k-1}(x_{t-1}^k, x_t^k) - \bar{F}_t^{k-1}(x_{t-1}^k, x_t^k, x_t^{P,k}) \\
& + \bar{F}_t^{k-1}(x_{t-1}^k, \bar{x}_t^k, x_t^{P,k}) - F_t^{k-1}(x_{t-1}^k, \bar{x}_t^k),
\end{aligned}
\]

where the above inequality comes from the fact that $\bar{x}_t^k \in X_t(x_{t-1}^k)$, i.e., $\bar{x}_t^k$ is feasible for the optimization problem (8) with objective function $\bar{F}_t^{k-1}(x_{t-1}^k, \cdot, x_t^{P,k})$ and optimal solution $x_t^k$. We obtain

\[
\begin{aligned}
0 \le F_t^{k-1}(x_{t-1}^k, x_t^k) - F_t^{k-1}(x_{t-1}^k, \bar{x}_t^k) &\le \lambda_{t,k}\big( \|\bar{x}_t^k - x_t^{P,k}\|^2 - \|x_t^k - x_t^{P,k}\|^2 \big) \\
&\le \lambda_{t,k} \|\bar{x}_t^k - x_t^{P,k}\|^2 \le \lambda_{t,k} D(X_t)^2,
\end{aligned}
\]

where $D(X_t)$ is the diameter of $X_t$ (since $X_t$ is compact, $D(X_t)$ is finite). Since $\lambda_{t,k}$ converges to zero, this gives

\[
\lim_{k \to +\infty} F_t^{k-1}(x_{t-1}^k, x_t^k) - F_t^{k-1}(x_{t-1}^k, \bar{x}_t^k) = 0.
\tag{17}
\]

Combining (13), (14), (16), and (17), we obtain $H(t)$.

(i) Proceeding as above for $t = 1$, we obtain for $Q_1(x_0) - \underline{Q}_1^k(x_0)$ the bounds

\[
0 \le Q_1(x_0) - \underline{Q}_1^k(x_0) \le Q_1(x_0) - \underline{Q}_1^{k-1}(x_0) \le Q_2(x_1^k) - Q_2^{k-1}(x_1^k) + \lambda_{1,k} D(X_1)^2.
\tag{18}
\]

Since $H(2)$ holds, since functions $(Q_2^k(\cdot))_k$ are Lipschitz continuous on $X_1$, $Q_2 \ge Q_2^k \ge Q_2^{k-1}$, and $(x_1^k)_k$ is a sequence of the compact set $X_1$, we obtain, using again Lemma A.1 in [16], that $\lim_{k \to +\infty} Q_2(x_1^k) - Q_2^{k-1}(x_1^k) = 0$ and passing to the limit in (18) when $k \to +\infty$, we get $\lim_{k \to +\infty} \underline{Q}_1^k(x_0) = Q_1(x_0)$. The above computations also show that

\[
-\lambda_{1,k} D(X_1)^2 \le Q_1(x_0) - F_1^{k-1}(x_0, x_1^k) \le Q_2(x_1^k) - Q_2^{k-1}(x_1^k),
\]

which implies that $Q_1(x_0) = \lim_{k \to +\infty} F_1^{k-1}(x_0, x_1^k) = \lim_{k \to +\infty} \bar{F}_1^{k-1}(x_0, x_1^k, x_1^{P,k})$.

(ii) Let $(x_1^*, \ldots, x_T^*)$ be an accumulation point of $(x_1^k, \ldots, x_T^k)_k$ and let $K$ be an infinite set of integers such that $\lim_{k \in K, k \to +\infty} (x_1^k, \ldots, x_T^k) = (x_1^*, \ldots, x_T^*)$ (see footnote 2). Take now $t \in \{1,\ldots,T\}$. Setting $x_0^* = x_0$, from (10), (13), (18), and using the continuity of $Q_t$, we have

\[
\begin{aligned}
Q_t(x_{t-1}^*) &= \lim_{k \in K, k \to +\infty} \underline{Q}_t^{k-1}(x_{t-1}^k) = \lim_{k \in K, k \to +\infty} F_t^{k-1}(x_{t-1}^k, \bar{x}_t^k) \\
&= \lim_{k \in K, k \to +\infty} F_t^{k-1}(x_{t-1}^k, x_t^k) + F_t^{k-1}(x_{t-1}^k, \bar{x}_t^k) - F_t^{k-1}(x_{t-1}^k, x_t^k).
\end{aligned}
\tag{19}
\]

Footnote 2. Note that the existence of an accumulation point comes from the fact that $(x_1^k, \ldots, x_T^k)_k$ is a sequence of the compact set $X_1 \times \cdots \times X_T$.

We have shown that $\lim_{k \in K, k \to +\infty} F_t^{k-1}(x_{t-1}^k, \bar{x}_t^k) - F_t^{k-1}(x_{t-1}^k, x_t^k) = 0$, which implies

\[
Q_t(x_{t-1}^*) = \lim_{k \in K, k \to +\infty} F_t^{k-1}(x_{t-1}^k, x_t^k).
\]

Using the continuity of $Q_{t+1}$, the fact that $\lim_{k \in K, k \to +\infty} Q_{t+1}(x_t^k) - Q_{t+1}^{k-1}(x_t^k) = 0$, and the lower semicontinuity of $f_t$, we obtain

\[
F_t(x_{t-1}^*, x_t^*) = f_t(x_{t-1}^*, x_t^*) + Q_{t+1}(x_t^*) \le \lim_{k \in K, k \to +\infty} F_t^{k-1}(x_{t-1}^k, x_t^k) = Q_t(x_{t-1}^*).
\tag{20}
\]

Since $g_t$ is lower semicontinuous, its level sets are closed, which implies that $g_t(x_{t-1}^*, x_t^*) \le 0$. Recalling that $x_t^k \in X_t$ with $X_t$ closed, we have that $x_t^*$ is feasible for the problem defining $Q_t(x_{t-1}^*)$. Combining this observation with (20), we have shown that $x_t^*$ is an optimal solution for the problem defining $Q_t(x_{t-1}^*)$, i.e., problem (2) written for $x_{t-1} = x_{t-1}^*$. This shows that $(x_1^*, \ldots, x_T^*)$ is an optimal solution to (1).

Although the convergence of REDDP holds for any sequence $(x_t^{P,k})_{k \ge 2}$ of prox-centers in $X_t$ and any sequence of penalty parameters $\lambda_{t,k}$ converging to zero for every $t$, the performance of the method depends on how these sequences are chosen. DDP is obtained taking $\lambda_{t,k} = 0$ for every $t, k$. For all numerical experiments of Section 6.2, REDDP was much faster than DDP. Some natural candidates for $\lambda_{t,k}$ and $x_t^{P,k}$, used in our numerical tests, are the following:

• Weighted average of previous values: $x_t^{P,k} = \frac{1}{\Gamma_{t,k}} \sum_{j=1}^{k-1} \gamma_{t,k,j} x_t^j$ with $\gamma_{t,k,j}$ nonnegative weights and $\Gamma_{t,k} = \sum_{j=1}^{k-1} \gamma_{t,k,j}$. Note that $x_t^{P,k} \in X_t$ because all $x_t^j$ are in the convex set $X_t$. Special cases include the average of previous values $x_t^{P,k} = \frac{1}{k-1} \sum_{j=1}^{k-1} x_t^j$ and the last trial point $x_t^{P,k} = x_t^{k-1}$ for $t < T$, $k \ge 2$.

• $\lambda_{t,k} = \rho_t^k$ where $0 < \rho_t < 1$ or $\lambda_{t,k} = \frac{1}{k^2}$ for $t < T$, $k \ge 2$.

Remark 2.5.
If for a given stage $t$, $X_t$ is a polytope and we do not have the nonlinear constraints given by constraint functions $g_t$ (i.e., the constraints for this stage are linear), then the conclusions of Lemmas 2.1, 2.3, and Theorem 2.4 hold under weaker assumptions. More precisely, for such stages $t$, we assume (H0)-(a), (H0)-(b), and instead of (H0)-(d), (H0)-(e), the weaker assumption (H0)-(c'):

(H0)-(c') There exists $\varepsilon > 0$ such that:
(c').1) $X_{t-1}^{\varepsilon} \times X_t \subset \mathrm{dom}\, f_t$;
(c').2) for every $x_{t-1} \in X_{t-1}$, the set $X_t(x_{t-1})$ is nonempty.

3. Regularized Stochastic Dual Dynamic programming

3.1. Problem formulation and assumptions. Consider a stochastic process $(\xi_t)$ where $\xi_t$ is a discrete random vector with finite support containing in particular as components the entries in $(b_t, A_t, B_t)$ in a given order where $b_t$ are random vectors and $A_t, B_t$ are random matrices. We denote by $\mathcal{F}_t$ the sigma-algebra $\sigma(\xi_1, \ldots, \xi_t)$ and by $\mathcal{Z}_t$ the set of $\mathcal{F}_t$-measurable functions; $\mathbb{E}_{|\mathcal{F}_{t-1}} : \mathcal{Z}_t \to \mathcal{Z}_{t-1}$ is the conditional expectation at $t$.

With this notation, we are interested in solving problems of form

\[
\begin{aligned}
\inf_{x_1 \in X_1(x_0, \xi_1)} f_1(x_0, x_1, \xi_1) + \mathbb{E}_{|\mathcal{F}_1}\Big( & \inf_{x_2 \in X_2(x_1, \xi_2)} f_2(x_1, x_2, \xi_2) + \ldots \\
& + \mathbb{E}_{|\mathcal{F}_{T-2}}\Big( \inf_{x_{T-1} \in X_{T-1}(x_{T-2}, \xi_{T-1})} f_{T-1}(x_{T-2}, x_{T-1}, \xi_{T-1}) \\
& + \mathbb{E}_{|\mathcal{F}_{T-1}}\Big( \inf_{x_T \in X_T(x_{T-1}, \xi_T)} f_T(x_{T-1}, x_T, \xi_T) \Big) \Big) \ldots \Big)
\end{aligned}
\tag{21}
\]

for some functions $f_t$ taking values in $\mathbb{R} \cup \{+\infty\}$, where $x_0$ is given and where

\[
X_t(x_{t-1}, \xi_t) = \{ x_t \in X_t : g_t(x_{t-1}, x_t, \xi_t) \le 0, \ A_t x_t + B_t x_{t-1} = b_t \}
\]

for some vector-valued function $g_t$ and some nonempty compact convex set $X_t \subset \mathbb{R}^n$.

We make the following assumption on $(\xi_t)$:

(H1) $(\xi_t)$ is interstage independent and for $t = 2,\ldots,T$, $\xi_t$ is a random vector taking values in $\mathbb{R}^K$ with discrete distribution and finite support $\Theta_t = \{\xi_{t,1}, \ldots, \xi_{t,M}\}$ while $\xi_1$ is deterministic.
To alleviate notation and without loss of generality, we have assumed that the number $M$ of possible realizations of $\xi_t$, the size $K$ of $\xi_t$, and $n$ of $x_t$ do not depend on $t$.

Under Assumption (H1), $\mathbb{E}_{|\mathcal{F}_{t-1}}$ coincides with its unconditional counterpart $\mathbb{E}_t$ where $\mathbb{E}_t$ is the expectation computed with respect to the distribution of $\xi_t$. To ease notation, we will drop the index $t$ in $\mathbb{E}_t$. As a result, for problem (21), we can write the following dynamic programming equations: we set $Q_{T+1} \equiv 0$ and for $t = 2,\ldots,T$, define

\[
Q_t(x_{t-1}) = \mathbb{E}\big( \mathfrak{Q}_t(x_{t-1}, \xi_t) \big)
\tag{22}
\]

with

\[
\mathfrak{Q}_t(x_{t-1}, \xi_t) =
\begin{cases}
\inf_{x_t} F_t(x_{t-1}, x_t, \xi_t) := f_t(x_{t-1}, x_t, \xi_t) + Q_{t+1}(x_t) \\
x_t \in X_t(x_{t-1}, \xi_t) = \{ x_t \in X_t : g_t(x_{t-1}, x_t, \xi_t) \le 0, \ A_t x_t + B_t x_{t-1} = b_t \}.
\end{cases}
\tag{23}
\]

Problem (21) can then be written

\[
\begin{cases}
\inf_{x_1} F_1(x_0, x_1, \xi_1) := f_1(x_0, x_1, \xi_1) + Q_2(x_1) \\
x_1 \in X_1(x_0, \xi_1) = \{ x_1 \in X_1 : g_1(x_0, x_1, \xi_1) \le 0, \ A_1 x_1 + B_1 x_0 = b_1 \},
\end{cases}
\tag{24}
\]

with optimal value denoted by $Q_1(x_0) = \mathfrak{Q}_1(x_0, \xi_1)$.

Recalling definition (3) of the $\varepsilon$-fattening of a set, we make the following Assumption (H2) for $t = 1,\ldots,T$:

1) $X_t \subset \mathbb{R}^n$ is nonempty, convex, and compact.
2) For every $x_{t-1}, x_t \in \mathbb{R}^n$ the function $f_t(x_{t-1}, x_t, \cdot)$ is measurable and for every $j = 1,\ldots,M$, the function $f_t(\cdot, \cdot, \xi_{t,j})$ is proper, convex, and lower semicontinuous.
3) For every $j = 1,\ldots,M$, each component of the function $g_t(\cdot, \cdot, \xi_{t,j})$ is a convex lower semicontinuous function.
4) There exists $\varepsilon > 0$ such that:
4.1) for every $j = 1,\ldots,M$, $X_{t-1}^{\varepsilon} \times X_t \subset \mathrm{dom}\, f_t(\cdot, \cdot, \xi_{t,j})$;
4.2) for every $j = 1,\ldots,M$, for every $x_{t-1} \in X_{t-1}^{\varepsilon}$, the set $X_t(x_{t-1}, \xi_{t,j})$ is nonempty.
5) If $t \ge 2$, for every $j = 1,\ldots,M$, there exists $\bar{x}_{t,j} = (\bar{x}_{t,j,t-1}, \bar{x}_{t,j,t}) \in X_{t-1} \times \mathrm{ri}(X_t) \cap \mathrm{ri}(\{g_t(\cdot, \cdot, \xi_{t,j}) \le 0\})$ such that $\bar{x}_{t,j,t} \in X_t(\bar{x}_{t,j,t-1}, \xi_{t,j})$.
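To make equations (22)-(24) concrete, here is a small Python sketch that tabulates the expected recourse functions on a grid for a toy instance with finitely many realizations per stage, as in (H1). It is only an illustration under assumptions that are not taken from the paper: a scalar state restricted to a three-point grid, a hypothetical quadratic cost `stage_cost`, no constraints beyond membership in the grid, and explicitly supplied probabilities `probs` (the paper only requires a discrete distribution with finite support). The helper names `expected_recourse` and `first_stage_value` are likewise hypothetical.

```python
def stage_cost(x_prev, x, xi):
    """Hypothetical convex cost f_t(x_{t-1}, x_t, xi_t)."""
    return (x - xi) ** 2 + 0.5 * (x - x_prev) ** 2


def expected_recourse(grid, supports, probs):
    """Tabulate Q_t on the grid for t = T+1, T, ..., 2.

    supports[t] lists the realizations xi_{t,1}, ..., xi_{t,M} of xi_t and
    probs[t] their probabilities (both are inputs assumed for this sketch).
    """
    T = max(supports)
    Q = {T + 1: {x: 0.0 for x in grid}}  # convention Q_{T+1} = 0
    for t in range(T, 1, -1):
        Q[t] = {}
        for x_prev in grid:
            # (23): one inner minimization per realization xi_{t,j}
            values = [
                min(stage_cost(x_prev, x, xi) + Q[t + 1][x] for x in grid)
                for xi in supports[t]
            ]
            # (22): expectation with respect to the distribution of xi_t
            Q[t][x_prev] = sum(p * v for p, v in zip(probs[t], values))
    return Q


def first_stage_value(grid, Q2, x0, xi1):
    """(24): first-stage problem with deterministic xi_1."""
    return min(stage_cost(x0, x, xi1) + Q2[x] for x in grid)


grid = [0.0, 0.5, 1.0]                      # discretized X_t (assumption)
supports = {2: [0.0, 1.0], 3: [0.0, 1.0]}   # T = 3, M = 2 realizations per stage
probs = {2: [0.5, 0.5], 3: [0.5, 0.5]}      # equal probabilities (assumption)

Q = expected_recourse(grid, supports, probs)
q1 = first_stage_value(grid, Q[2], x0=0.0, xi1=0.5)
print(Q[3], Q[2], q1)
```

Because of interstage independence, a single table per stage is enough: the same function $Q_{t+1}$ is reused for every realization of $\xi_t$, which is what keeps the recursion from growing with the number of scenarios.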
