Advances in Industrial Control
Derong Liu
Qinglai Wei
Ding Wang
Xiong Yang
Hongliang Li
Adaptive Dynamic
Programming with
Applications in
Optimal Control
Advances in Industrial Control
Series editors
Michael J. Grimble, Glasgow, UK
Michael A. Johnson, Kidlington, UK
More information about this series at http://www.springer.com/series/1412
Derong Liu, Institute of Automation, Chinese Academy of Sciences, Beijing, China
Qinglai Wei, Institute of Automation, Chinese Academy of Sciences, Beijing, China
Ding Wang, Institute of Automation, Chinese Academy of Sciences, Beijing, China
Xiong Yang, Tianjin University, Tianjin, China
Hongliang Li, Tencent Inc., Shenzhen, China
ISSN 1430-9491 ISSN 2193-1577 (electronic)
Advances in Industrial Control
ISBN 978-3-319-50813-9 ISBN 978-3-319-50815-3 (eBook)
DOI 10.1007/978-3-319-50815-3
Library of Congress Control Number: 2016959539
© Springer International Publishing AG 2017
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part
of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission
or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authors or the editors give a warranty, express or implied, with respect to the material contained herein or
for any errors or omissions that may have been made.
Printed on acid-free paper
This Springer imprint is published by Springer Nature
The registered company is Springer International Publishing AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Foreword
Nowadays, nonlinearity is involved in all walks of life. It is a challenge for
engineers to design controllers for all kinds of nonlinear systems. To handle this
issue, various nonlinear control theories have been developed, such as theories of
adaptive control, optimal control, and robust control. Among these theories, the
theory of optimal control has drawn considerable attention over the past several
decades. This is mainly because optimal control provides an effective way to design
controllers with guaranteed robustness properties as well as capabilities of optimization
and resource conservation that are important in manufacturing, vehicle
emission control, aerospace systems, power systems, chemical engineering processes,
and many other applications.
The core challenge in deriving the solutions of nonlinear optimal control
problems is that it often boils down to solving certain Hamilton–Jacobi–Bellman
(HJB) equations. The HJB equations are nonlinear and difficult to solve for general
nonlinear dynamical systems. Indeed, no closed-form solution to such equations
exists, except for very special problems. Therefore, numerical solutions to HJB
equations have been developed by engineers. To obtain such numerical solutions, a
highly effective method known as adaptive/approximate dynamic programming
(ADP) can be used. A distinct advantage of ADP is that it can avoid the well-known
“curse of dimensionality” of dynamic programming while adaptively solving the
HJB equations. Due to this characteristic, many elegant ADP approaches and their
applications have been developed in the literature during the past several decades.
It is also notable that ADP techniques provide a link with cognitive
decision-making methods that are observed in the human brain, and thus, ADP has
become a main channel to achieve truly brain-like intelligence in human-engineered
automatic control systems.
Unlike most ADP books, the present book “Adaptive Dynamic Programming
with Applications in Optimal Control” focuses on the principles of emerging
optimal control techniques for nonlinear systems in both discrete-time and
continuous-time domains, and on creating applications of these optimal control
techniques. This book contains three themes:
1. Optimal control for discrete-time nonlinear dynamical systems, covering various
novel techniques used to derive optimal control in the discrete-time domain,
such as general value iteration, θ-ADP, finite approximation error-based value
iteration, policy iteration, generalized policy iteration, and error bounds analysis
of ADP.
2. Optimal control for continuous-time nonlinear systems, discussing the optimal
control for input-affine/input-nonaffine nonlinear systems, robust and optimal
guaranteed cost control for input-affine nonlinear systems, decentralized control
for interconnected nonlinear systems, and optimal control for differential games.
3. Applications, providing typical applications of optimal control approaches in the
areas of energy management in smart homes, coal gasification, and water gas
shift reaction.
This book provides timely and informative coverage of ADP, including both
rigorous derivations and insightful developments. It will help both specialists and
nonspecialists understand the new developments in the field of nonlinear optimal
control using online/offline learning techniques. It will also help
engineers apply the developed ADP methods to their own problems in practice.
I am sure you will enjoy reading this book.
Arlington, TX, USA Frank L. Lewis
September 2016
Series Editors’ Foreword
The series Advances in Industrial Control aims to report and encourage technology
transfer in control engineering. The rapid development of control technology has an
impact on all areas of the control discipline: new theory, new controllers, actuators,
sensors, new industrial processes, computer methods, new applications, new design
philosophies, and new challenges. Much of this development work resides in
industrial reports, feasibility study papers, and the reports of advanced collaborative
projects. The series offers an opportunity for researchers to present an extended
exposition of such new work in all aspects of industrial control for wider and rapid
dissemination.
The method of dynamic programming has a long history in the field of optimal
control. It dates back to those days when the subject of control was emerging in a
modern form in the 1950s and 1960s. It was devised by Richard Bellman who gave
it a modern revision in a publication of 1954 [1]. The name of Bellman became
linked to an optimality equation, key to the method, and like the name of Kalman
became uniquely associated with the early development of optimal control. One
notable extension to the method was that of differential dynamic programming due
to David Q. Mayne in 1966 and developed at length in the book by Jacobson and
Mayne [2]. Their new technique used locally quadratic models for the system
dynamics and cost functions and improved the convergence of the dynamic
programming method for optimal trajectory control problems.
Since those early days, the subject of control has taken many different directions,
but dynamic programming has always retained a place in the theory of optimal
control fundamentals. It is therefore instructive for the Advances in Industrial
Control monograph series to have a contribution that presents new ways of solving
dynamic programming problems and demonstrates these methods on some up-to-date
industrial problems. This monograph, Adaptive Dynamic Programming with
Applications in Optimal Control, by Derong Liu, Qinglai Wei, Ding Wang, Xiong
Yang and Hongliang Li, has precisely that objective.
The authors open the monograph with a very interesting and relevant discussion
of another computationally difficult problem, namely devising a computer program
to defeat human master players at the Chinese game of Go. Inspiration from the
better programming techniques used in the Go-master problem was used by the
authors to defeat the “curse of dimensionality” that arises in dynamic programming
methods.
More formally, the objective of the techniques reported in the monograph is to
control in an optimal fashion an unknown or uncertain nonlinear multivariable
system using recorded and instantaneous output signals. The algorithms’ technical
framework is then constructed through different categories of the usual state-space
nonlinear ordinary differential system model. The system model can be continuous
or discrete, have affine or nonaffine control inputs, be subject to no constraints, or
have constraints present. A set of 11 chapters contains the theory for various
formulations of the system features.
Since standard dynamic programming schemes suffer from various implementation
obstacles, adaptive dynamic programming procedures have been developed
to find computable practical suboptimal control solutions. A key technique used by
the authors is that of neural networks which are trained using recorded data and
updated, or “adapted,” to accommodate uncertain system knowledge. The theory
chapters are arranged in two parts: Part 1 Discrete-Time Systems—five chapters;
and Part 2 Continuous-Time Systems—five chapters.
An important feature of the monographs of the Advances in Industrial Control
series is a demonstration of potential or actual application to industrial problems.
After a comprehensive presentation of the theory of adaptive dynamic programming,
the authors devote Part 3 of their monograph to three chapter-length application
studies. Chapter 12 examines the scheduling of energy supplies in a smart
home environment, a topic and problem of considerable contemporary interest.
Chapter 13 uses a coal gasification process that is suitably challenging to demonstrate
the authors’ techniques. And finally, Chapter 14 concerns the control of the
water gas shift reaction. In this example, the data used was taken from a real-world
operational system.
This monograph is very comprehensive in its presentation of the adaptive
dynamic programming theory and has demonstrations with three challenging processes.
It should find a wide readership in both the industrial control engineering
and the academic control theory communities. Readers in other fields such as
computer science and chemical engineering may also find the monograph of considerable
interest.
Michael J. Grimble
Michael A. Johnson
Industrial Control Centre
University of Strathclyde
Glasgow, Scotland, UK
References
1. Bellman R (1954) The theory of dynamic programming. Bulletin of the American Mathematical
Society 60(6):503–515
2. Jacobson DH, Mayne DQ (1970) Differential dynamic programming. American Elsevier Pub.
Co., New York