
Dynamic Programming and its Applications. Proceedings of the International Conference on Dynamic Programming and its Applications, University of British Columbia, Vancouver, British Columbia, Canada, April 14–16, 1977 PDF

403 Pages·1978·23.69 MB·English


ACADEMIC PRESS RAPID MANUSCRIPT REPRODUCTION

Proceedings of the International Conference on Dynamic Programming and Its Applications, University of British Columbia, Vancouver, British Columbia, Canada, April 14–16, 1977

DYNAMIC PROGRAMMING AND ITS APPLICATIONS

edited by Martin L. Puterman
Faculty of Commerce and Business Administration
The University of British Columbia
Vancouver, B.C., Canada

ACADEMIC PRESS  New York  San Francisco  London  1978
A Subsidiary of Harcourt Brace Jovanovich, Publishers

COPYRIGHT © 1978, BY ACADEMIC PRESS, INC.
ALL RIGHTS RESERVED. NO PART OF THIS PUBLICATION MAY BE REPRODUCED OR TRANSMITTED IN ANY FORM OR BY ANY MEANS, ELECTRONIC OR MECHANICAL, INCLUDING PHOTOCOPY, RECORDING, OR ANY INFORMATION STORAGE AND RETRIEVAL SYSTEM, WITHOUT PERMISSION IN WRITING FROM THE PUBLISHER.

ACADEMIC PRESS, INC.
111 Fifth Avenue, New York, New York 10003

United Kingdom Edition published by
ACADEMIC PRESS, INC. (LONDON) LTD.
24/28 Oval Road, London NW1 7DX

Library of Congress Cataloging in Publication Data
International Conference on Dynamic Programming and its Applications, University of British Columbia, 1977.
Dynamic programming and its applications.
"Proceedings of the International Conference on Dynamic Programming and its Applications, University of British Columbia, Vancouver, British Columbia, Canada, April 14-16, 1977."
1. Dynamic programming—Congresses. I. Puterman, Martin L. II. Title.
T57.83.I57 1977  519.7'03  78-21621
ISBN 0-12-568150-X

PRINTED IN THE UNITED STATES OF AMERICA
78 79 80 81 82  9 8 7 6 5 4 3 2 1

To Karel deLeeuw (1930-1978)

CONTRIBUTORS

Numbers in parentheses indicate the pages on which authors' contributions begin.

Dimitri P. Bertsekas (115), Decision and Control Laboratory, Department of Electrical Engineering and Coordinated Science Laboratory, University of Illinois, Urbana, Illinois 61801
Shelby L. Brumelle (91), Faculty of Commerce and Business Administration, University of British Columbia, Vancouver, British Columbia V6T 1W5, Canada
Eric V. Denardo (249, 255, 393), School of Organization and Management, Yale University, New Haven, Connecticut 06520
Cyrus Derman (163), Division of Mathematical Methods in Engineering and Operations Research, Columbia University, New York, New York 10027
Bharat T. Doshi (269), Department of Statistics, Hill Center for the Mathematical Sciences, Rutgers, The State University of New Jersey, Busch Campus, New Brunswick, New Jersey 08903
Awi Federgruen (3, 23), Mathematisch Centrum, 2e Boerhaavestraat 49, Amsterdam, The Netherlands
James Flynn (173), Apt. 303, 2116 Pauline Blvd., Ann Arbor, Michigan 48103
Bennett L. Fox (249), Département d'Informatique, Université de Montréal, C.P. 6128, Succursale "A", Montréal, P.Q. H3C 3J7, Canada
Karl Hinderer (289, 389), Institut für Mathematische Statistik der Universität Karlsruhe, Englerstrasse 2, 75 Karlsruhe 1, West Germany
Arie Hordijk (3), Rijksuniversiteit te Leiden, Instituut voor Toegepaste Wiskunde en Informatica, Wassenaarseweg 80, Postbus 9512, 2300 RA Leiden, The Netherlands
Ronald A. Howard (201), Department of Engineering-Economic Systems, Stanford University, Stanford, California 94305
Seiichi Iwamoto (319), Department of Mathematics, Faculty of Integrated Arts and Science, Hiroshima University, Hiroshima 730, Japan
Mark R. Lembersky (207), Leader, Allocations and Systems Group, Weyerhaeuser Company, Tacoma, Washington 99401
Gerald J. Lieberman (163), Department of Operations Research, Stanford University, Stanford, California 94305
Thomas L. Morin (53), School of Industrial Engineering, Purdue University, West Lafayette, Indiana 47907
A. John Petkau (221), Department of Mathematics, University of British Columbia, Vancouver, British Columbia V6T 1W5, Canada
Stanley R. Pliska (335), Department of Industrial Engineering and Management Sciences, Northwestern University, Evanston, Illinois 60201
Martin L. Puterman (91), Faculty of Commerce and Business Administration, University of British Columbia, Vancouver, British Columbia V6T 1W5, Canada
Sheldon M. Ross (163), Department of Operations Research, University of California, Berkeley, California 94720
Uriel G. Rothblum (255), School of Organization and Management, Yale University, New Haven, Connecticut 06520
Manfred Schäl (351), Institut für Angewandte Mathematik der Universität Bonn, Wegelerstrasse 6, 53 Bonn, West Germany
Paul Schweitzer (23), The Graduate School of Management, The University of Rochester, Rochester, New York 14627
Steven E. Shreve (115), Department of Statistics, University of California, Berkeley, California 94720
Henk C. Tijms (3), Interfaculteit der Actuariële Wetenschappen en Econometrie, Vrije Universiteit, De Boelelaan 1081, Amsterdam, The Netherlands
Arthur F. Veinott, Jr. (397), Department of Operations Research, Stanford University, Stanford, California 94305
Carl J. Walters (233), Institute of Animal Resource Ecology, University of British Columbia, Vancouver, British Columbia V6T 1W5, Canada
D.J. White (131), Faculty of Economics and Social Studies, Department of Decision Theory, University of Manchester, Manchester M13 9PL, England
Jacob Wijngaard (369), Eindhoven University of Technology, P.O. Box 513, Eindhoven, The Netherlands

PREFACE

Research and interest in dynamic programming have evolved rapidly since publication over twenty years ago of the initial papers in the field. Scholars have investigated the theory and application of dynamic programming at many centres throughout the world, and my colleagues at the University of British Columbia and I felt that an international meeting of prominent researchers would be timely and worthwhile.
This volume presents the proceedings of the International Conference on Dynamic Programming and Its Applications held at the University of British Columbia, Vancouver, British Columbia, Canada, April 14–16, 1977. These proceedings consist of twenty papers and an edited summary of a panel discussion on the state of the art and future directions for dynamic programming. All of the papers have been reviewed by me and, in most cases, by external referees. The papers have been divided into three categories: surveys, theory, and applications. Some of the papers could have been classified differently, but I felt that the three groupings accurately represent the main focus of each paper. Two of the papers included for publication (the ones by Lembersky and White) were not presented at the conference because the authors were unable to attend.

The survey papers summarize and unify work in several of the main areas of dynamic programming research. The paper of Federgruen, Hordijk, and Tijms presents recurrence conditions for countable state Markov decision problems which ensure that the optimal average reward exists and satisfies the functional equation of dynamic programming. Federgruen and Schweitzer give an extensive analysis of the theory of successive approximation for Markov decision problems, including many of the authors' recent results for the undiscounted case. Morin summarizes, unifies, and comments on computational methods for deterministic, finite horizon problems and includes an extensive bibliography. Puterman and Brumelle survey work on policy iteration and present and extend their recent work on the relationship between policy iteration and Newton's method. Shreve and Bertsekas present a unified and extremely insightful presentation of several foundational questions. White summarizes and unifies the theory of action elimination procedures for Markov decision problems.
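The policy iteration algorithm that Puterman and Brumelle relate to Newton's method can be sketched on a small discounted Markov decision problem. This is a minimal illustration, not any contributor's formulation: the two-state, two-action data below are invented, and the discounted case is only one of the settings the survey covers.

```python
import numpy as np

# Illustrative (invented) MDP data: P[a][s, s'] is the transition
# probability from s to s' under action a; r[a][s] is the one-step reward.
P = {
    0: np.array([[0.9, 0.1], [0.4, 0.6]]),
    1: np.array([[0.2, 0.8], [0.5, 0.5]]),
}
r = {
    0: np.array([1.0, 0.0]),
    1: np.array([0.0, 2.0]),
}
beta = 0.9  # discount factor

def policy_iteration(P, r, beta, n_states=2, n_actions=2):
    policy = np.zeros(n_states, dtype=int)
    while True:
        # Policy evaluation: solve (I - beta * P_d) v = r_d exactly,
        # where P_d, r_d are the rows selected by the current policy.
        P_d = np.array([P[policy[s]][s] for s in range(n_states)])
        r_d = np.array([r[policy[s]][s] for s in range(n_states)])
        v = np.linalg.solve(np.eye(n_states) - beta * P_d, r_d)
        # Policy improvement: act greedily with respect to v.
        q = np.array([[r[a][s] + beta * P[a][s] @ v for a in range(n_actions)]
                      for s in range(n_states)])
        new_policy = q.argmax(axis=1)
        if np.array_equal(new_policy, policy):
            return policy, v
        policy = new_policy
```

Each evaluation-improvement cycle solves a linearization of the fixed-point equation v = max_a (r_a + beta * P_a v) and re-linearizes at the improved policy, which is the sense in which policy iteration behaves like Newton's method applied to the optimality equation.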
The application papers present practical and theoretical applications of dynamic programming methodology. Derman, Lieberman, and Ross outline their recent work on the structure of optimal policies for replacement problems. Flynn discusses the theory and use of steady-state policies and their applicability to optimal economic growth problems. Howard gives an anecdotal presentation of his original work on Markov decision problems and comments on why these methods have not been applied; a brief summary of questions and answers following his conference presentation is included. Lembersky summarizes applications of Markov decision process techniques to forest stand management and also presents some related theoretical results. Petkau shows how backward induction and optimal stochastic control methodology can be used to solve statistical decision problems arising from sequential medical trials. Walters discusses recent formulation and computational results for fisheries management problems.

The theory section contains several original papers. Denardo and Fox give an extended summary of their new computational methods for shortest route problems. Denardo and Rothblum detail their work on problems with affine reward and transition structure. Doshi presents new results on the structure and characterization of optimal policies for a problem in controlled diffusions. Hinderer analyzes methods for approximating dynamic programs, including several potentially useful bounds. Iwamoto extends his work on inverse dynamic programming to infinite horizon problems and gives some interesting examples. Pliska generalizes Veinott's and Hordijk's results on transient policies to problems with general state spaces and discusses the applicability of these generalized results to a problem in the control of multitype branching processes.
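Several of the applications summarized above rest on the same finite-horizon recursion. A minimal, generic sketch of backward induction follows; the problem data (states, actions, rewards, transitions) are supplied by the caller, and none of it is taken from any particular paper in this volume.

```python
# Generic backward induction for a finite-horizon dynamic program.
# V[t][s] = max over a of: reward(t, s, a) + sum over s' of
#           transition-probability(t, s, a, s') * V[t+1][s'].

def backward_induction(T, states, actions, reward, transition):
    """reward(t, s, a) -> float; transition(t, s, a) -> list of (s', prob)."""
    V = {T: {s: 0.0 for s in states}}  # terminal values (zero by convention)
    policy = {}
    for t in range(T - 1, -1, -1):     # sweep backward from the horizon
        V[t], policy[t] = {}, {}
        for s in states:
            best_a, best_v = None, float("-inf")
            for a in actions:
                v = reward(t, s, a) + sum(
                    p * V[t + 1][s2] for s2, p in transition(t, s, a))
                if v > best_v:
                    best_a, best_v = a, v
            V[t][s], policy[t][s] = best_v, best_a
    return V, policy
```

For example, with a single state, an action worth reward 1 per stage, and a three-stage horizon, the recursion returns a value of 3 at time 0; richer state spaces plug in the same way.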
Schäl studies negative dynamic programming in an abstract setting and, among other results, gives conditions that ensure that value iteration converges to the optimal return function. Wijngaard studies Markov decision problems with average reward criterion and shows how an unbounded reward structure can be exploited to prove the existence of optimal policies.

Professors Denardo, Hinderer, and Veinott gave short presentations in a panel discussion during the concluding session of the conference. Transcribed, edited, and annotated copies of their comments are included in the final section of these proceedings. These thought-provoking remarks survey the current vista and highlight some areas that researchers will be studying in the near future.

I hope that this volume illustrates the richness and scope of dynamic programming research of the past quarter-century and helps to inspire further work in this field.

This conference would not have been possible without the generous financial assistance of the National Research Council of Canada, the Faculty of Commerce of the University of British Columbia, and the Management Science Research Centre of the University of British Columbia. Special thanks are due to the Faculty of Commerce for use of the E.D. MacPhee Conference Centre. I also would like to thank the following people for their conscientious refereeing of papers in these proceedings: Dimitri Bertsekas, Shelby Brumelle, Bharat Doshi, Bennett Fox, Frieda Granot, Arie Hordijk, Thomas Morin, Stanley Pliska, Manfred Schäl, Moon Shin, Bruce Sinclair, and Henk Tijms. Finally, I am extremely grateful to Mrs. Maryse Ellis for her exceptional typing of this publication and to my wife, Dorothea Katzenstein, for editorial assistance.

MARTIN L. PUTERMAN

Dynamic Programming and Its Applications

RECURRENCE CONDITIONS IN DENUMERABLE STATE MARKOV DECISION PROCESSES

A. Federgruen, Mathematisch Centrum, Amsterdam
A. Hordijk, Universiteit van Leiden, Leiden
H.C. Tijms, Vrije Universiteit, Amsterdam

This paper considers an undiscounted semi-Markov decision problem with denumerable state space and compact metric action spaces. Recurrence conditions on the transition probability matrices associated with the stationary policies are considered and relations between these conditions are established. Also it is shown that under each of these conditions the optimality equation for the average costs has a bounded solution.

I. INTRODUCTION

In this paper we consider an undiscounted semi-Markov decision model specified by five objects (I, A(i), p_ij(a), c(i,a), τ(i,a)). We are concerned with a dynamic system which at decision epochs beginning with epoch 0 is observed to be in one of the states of the denumerable state space I. After observing the state of the system, an action must be chosen.

Received September 1977 - Revised March 1978
Copyright © 1979 by Academic Press, Inc. All rights of reproduction in any form reserved.
ISBN 0-12-568150-X
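For a concrete handle on the average-cost optimality equation, here is a hedged sketch of relative value iteration on a small finite-state instance with unit transition times, i.e. the Markov special case with invented data. The paper itself treats the denumerable-state semi-Markov setting, which this toy does not capture: it only shows the fixed-point equation g + v(i) = min_a [c(i,a) + sum_j p_ij(a) v(j)] being solved numerically.

```python
import numpy as np

# Invented two-state, two-action data: P[a] is the transition matrix and
# c[a] the one-step cost vector under action a. All entries positive, so
# the chain is unichain and aperiodic and the iteration converges.
P = {0: np.array([[0.5, 0.5], [0.3, 0.7]]),
     1: np.array([[0.8, 0.2], [0.6, 0.4]])}
c = {0: np.array([2.0, 0.5]),
     1: np.array([1.0, 3.0])}

def relative_value_iteration(P, c, n_iters=500):
    """Return (g, h): estimated optimal average cost and relative values."""
    n_states = len(next(iter(c.values())))
    h = np.zeros(n_states)
    for _ in range(n_iters):
        Th = np.min([c[a] + P[a] @ h for a in P], axis=0)  # one DP backup
        g = Th[0]   # anchor the gain estimate at state 0
        h = Th - g  # subtract it so the iterates stay bounded
    return g, h
```

At convergence the returned pair satisfies g + h(i) = min_a [c(i,a) + sum_j p_ij(a) h(j)] up to numerical tolerance, with h(0) = 0 by the anchoring choice.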
