STOCHASTIC
DYNAMIC
PROGRAMMING
SUCCESSIVE APPROXIMATIONS
AND NEARLY OPTIMAL STRATEGIES
FOR MARKOV DECISION PROCESSES
AND MARKOV GAMES
DISSERTATION
TO OBTAIN THE DEGREE OF DOCTOR IN THE
TECHNICAL SCIENCES AT THE TECHNISCHE
HOGESCHOOL EINDHOVEN, BY AUTHORITY OF THE RECTOR
MAGNIFICUS, PROF. IR. J. ERKELENS, TO BE
DEFENDED IN PUBLIC BEFORE A COMMITTEE
APPOINTED BY THE COLLEGE OF DEANS ON
FRIDAY 19 SEPTEMBER 1980 AT 16.00 HOURS
BY
JOHANNES VAN DER WAL
BORN IN AMSTERDAM
1980
MATHEMATISCH CENTRUM, AMSTERDAM
This dissertation has been approved
by the supervisors
Prof.dr. J. Wessels
and
Prof.dr. J.F. Benders
To Willemien
To my mother
CONTENTS
CHAPTER 1. GENERAL INTRODUCTION
1.1. Informal description of the models
1.2. The functional equations 3
1.3. Review of the existing algorithms 4
1.4. Summary of the following chapters 6
1.5. Formal description of the MDP model 9
1.6. Notations 13
CHAPTER 2. THE GENERAL TOTAL REWARD MDP
2.1. Introduction 17
2.2. Some preliminary results 18
2.3. The finite-stage MDP 22
2.4. The optimality equation 26
2.5. The negative case 28
2.6. The restriction to Markov strategies 30
2.7. Nearly-optimal strategies 32
CHAPTER 3. SUCCESSIVE APPROXIMATION METHODS FOR THE TOTAL-REWARD MDP
3.1. Introduction 43
3.2. Standard successive approximations 44
3.3. Successive approximation methods and go-ahead functions 49
3.4. The operators L_δ(π) and U_δ 53
3.5. The restriction to Markov strategies in U_δ v 58
3.6. Value-oriented successive approximations 61
CHAPTER 4. THE STRONGLY CONVERGENT MDP
4.1. Introduction 65
4.2. Conservingness and optimality 70
4.3. Standard successive approximations 73
4.4. The policy iteration method 74
4.5. Strong convergence and Liapunov functions 76
4.6. The convergence of U_δ^n v to v* 80
4.7. Stationary go-ahead functions and strong convergence 86
4.8. Value-oriented successive approximations 88
CHAPTER 5. THE CONTRACTING MDP
5.1. Introduction 93
5.2. The various contractive MDP models 94
5.3. Contraction and strong convergence 103
5.4. Contraction and successive approximations 104
5.5. The discounted MDP with finite state and action spaces 108
5.6. Sensitive optimality 115
CHAPTER 6. INTRODUCTION TO THE AVERAGE-REWARD MDP
6.1. Optimal stationary strategies 117
6.2. The policy iteration method 119
6.3. Successive approximations 123
CHAPTER 7. SENSITIVE OPTIMALITY
7.1. Introduction 129
7.2. The equivalence of k-order average optimality and
(k-1)-discount optimality 131
7.3. Equivalent successive approximation methods 138
CHAPTER 8. POLICY ITERATION, GO-AHEAD FUNCTIONS AND SENSITIVE OPTIMALITY
8.1. Introduction 141
8.2. Some notations and preliminaries 142
8.3. The Laurent series expansion of L_{ρ,δ}(h)v_ρ(f) 146
8.4. The policy improvement step 149
8.5. The convergence proof 153
CHAPTER 9. VALUE-ORIENTED SUCCESSIVE APPROXIMATIONS FOR THE AVERAGE-
REWARD MDP
9.1. Introduction 159
9.2. Some preliminaries 162
9.3. The irreducible case 163
9.4. The general unichain case 166
9.5. Geometric convergence for the unichain case 171
9.6. The communicating case 173
9.7. Simply connectedness 178
9.8. Some remarks 179
CHAPTER 10. INTRODUCTION TO THE TWO-PERSON ZERO-SUM MARKOV GAME
10.1. The model of the two-person zero-sum Markov game 183
10.2. The finite-stage Markov game 185
10.3. Two-person zero-sum Markov games and the restriction
to Markov strategies 190
10.4. Introduction to the ∞-stage Markov game 193
CHAPTER 11. THE CONTRACTING MARKOV GAME
11.1. Introduction 197
11.2. The method of standard successive approximations 201
11.3. Go-ahead functions 203
11.4. Stationary go-ahead functions 206
11.5. Policy iteration and value-oriented methods 209
11.6. The strongly convergent Markov game 212
CHAPTER 12. THE POSITIVE MARKOV GAME WHICH CAN BE TERMINATED BY
THE MINIMIZING PLAYER
12.1. Introduction 215
12.2. Some preliminary results 218
12.3. Bounds on v* and nearly-optimal stationary strategies 222
CHAPTER 13. SUCCESSIVE APPROXIMATIONS FOR THE AVERAGE-REWARD MARKOV GAME
13.1. Introduction and some preliminaries 227
13.2. The unichained Markov game 232
13.3. The functional equation Uv = v+ge has a solution 235
References 239
Symbol index 248
Samenvatting (Dutch summary) 250
Curriculum vitae 253
CHAPTER 1
GENERAL INTRODUCTION