LEARNING AND APPROXIMATE DYNAMIC PROGRAMMING
Scaling Up to the Real World

Edited by
Jennie Si, Andy Barto, Warren Powell, and Donald Wunsch

A Wiley-Interscience Publication
JOHN WILEY & SONS
New York • Chichester • Weinheim • Brisbane • Singapore • Toronto

Preface

Complex artificial systems have become integral and critical components of modern society. The unprecedented rate at which computers, networks, and other advanced technologies are being developed ensures that our dependence on such systems will continue to increase. Examples of such systems include computer and communication networks, transportation networks, banking and finance systems, electric power grid, oil and gas pipelines, manufacturing systems, and systems for national defense. These are usually multi-scale, multi-component, distributed, dynamic systems. While advances in science and engineering have enabled us to design and build complex systems, comprehensive understanding of how to control and optimize them is clearly lacking.

There is an enormous body of literature describing specific methods for controlling specific complex systems based on various simplifying assumptions and requiring a range of performance compromises. While much has been said about complex systems, these systems are usually too complicated for the conventional mathematical methodologies that have proven to be successful in designing instruments. The mere existence of complex systems does not necessarily mean that they are operating under the most desirable conditions with enough robustness to withstand the kinds of disturbances that inevitably arise. This was made clear, for example, by the major power outage across dozens of cities in the Eastern United States and Canada in August of 2003.

Dynamic programming is a well-known, general-purpose method to deal with complex systems, to find optimal control strategies for nonlinear and stochastic dynamic systems.
It is based on the Bellman equation which suffers from a severe “curse of dimensionality” (for some problems, there can even be three curses of dimensionality). This has limited its applications to very small problems. The same may be said of the classical “min-max” algorithms for zero-sum games, which are closely related.

Over the past two decades, substantial progress has been made through efforts in multiple disciplines such as adaptive/optimal/robust control, machine learning, neural networks, economics, and operations research. For the most part, these efforts have not been cohesively linked, with multiple parallel efforts sometimes being pursued without knowledge of what others have done. A major goal of the 2002 NSF workshop was to bring these parallel communities together to discuss progress and to share ideas. Through this process, we are hoping to better define a community with common interests and to help develop a common vocabulary to facilitate communication.

Despite the diversity in the tools and languages used, a common focus of these researchers has been to develop methods capable of finding high-quality approximate solutions to problems whose exact solutions via classical dynamic programming are not attainable in practice due to high computational complexity and lack of accurate knowledge of system dynamics. At the workshop, the phrase approximate dynamic programming (ADP) was identified to represent this stream of activities.

A number of important results were reported at the workshop, suggesting that these new approaches based on approximating dynamic programming can indeed scale up to the needs of large-scale problems that are important for our society. However, to translate these results into systems for the management and control of real-world complex systems will require substantial multi-disciplinary research directed toward integrating higher-level modules, extending multi-agent, hierarchical, and hybrid systems concepts. There is a lot left to be done!

This book is a summary of the results presented at the workshop, and is organized with several objectives in mind.
First, it introduces the common theme of ADP to a large, interdisciplinary research community to raise awareness and to inspire more research results. Second, it provides readers with detailed coverage of some existing ADP approaches, both analytically and empirically, which may serve as a baseline to develop further results. Third, it demonstrates the successes that ADP methods have already achieved in furthering our ability to manage and optimize complex systems.

The organization of the book is as follows. It starts with a strategic overview and future directions of the important field of ADP. The remainder contains three parts. Part One aims at providing readers a clear introduction of some existing ADP frameworks and details on how to implement such systems. Part Two presents important and advanced research results that are currently under development and that may lead to important discoveries in the future. Part Three is dedicated to applications of various ADP techniques. These applications demonstrate how ADP can be applied to large and realistic problems arising from many different fields, and they provide insights for guiding future applications.

Additional information about the 2002 NSF workshop can be found at
http://www.eas.asu.edu/~nsfadp

Jennie Si, Tempe, AZ
Andrew G. Barto, Amherst, MA
Warren B. Powell, Princeton, NJ
Donald C. Wunsch, Rolla, MO
April 14, 2004

Acknowledgments

The contents of this book are based on the workshop “Learning and Approximate Dynamic Programming,” which was sponsored by the National Science Foundation (NSF grant number ECS-0223696), and held in Playacar, Mexico in April of 2002. This book is a result of active participation and contribution from the workshop participants and the chapter contributors. Their names and addresses are listed below.

Charles W. Anderson
Department of Computer Science
Colorado State University
Fort Collins, CO 80523 USA

S. N. Balakrishnan
Department of Mechanical and Aerospace Engineering and Engineering Mechanics
University of Missouri-Rolla
Rolla, MO 65409 USA

Andrew G. Barto
Department of Computer Science
University of Massachusetts
Amherst, MA 01003 USA

Dimitri P. Bertsekas
Laboratory for Information and Decision Systems
Massachusetts Institute of Technology
Cambridge, MA 02139 USA

Zeungnam Bien
Department of Electrical Engineering and Computer Science
Korea Advanced Institute of Science and Technology
Yuseong-gu, Daejeon 305-701 Republic of Korea

Vivek S. Borkar
School of Technology and Computer Science
Tata Institute of Fundamental Research
Mumbai, 400005 India

Xi-Ren Cao
Department of Electrical and Electronic Engineering
Hong Kong University of Science and Technology
Clear Water Bay, Kowloon, Hong Kong

Daniel Pucci de Farias
Department of Mechanical Engineering
Massachusetts Institute of Technology
Cambridge, MA 02139 USA

Thomas G. Dietterich
School of Electrical Engineering and Computer Science
Oregon State University
Corvallis, OR 97331 USA

Russell Enns
Boeing Company the Helicopter Systems
5000 East McDowell Road
Mesa, AZ 85215
Department of Electrical Engineering
Arizona State University
Tempe, AZ 85287 USA

Augustine O. Esogbue
Intelligent Systems and Controls Laboratory
Georgia Institute of Technology
Atlanta, GA 30332 USA

Silvia Ferrari
Department of Mechanical Engineering and Materials Science
Duke University
Durham, NC 27708 USA

Laurent El Ghaoui
Department of Electrical Engineering and Computer Science
University of California at Berkeley
Berkeley, CA 94720 USA

Mohammad Ghavamzadeh
Department of Computer Science
University of Massachusetts
Amherst, MA 01003 USA

Greg Grudic
Department of Computer Science
University of Colorado at Boulder
Boulder, CO 80309 USA

Dongchen Han
Department of Mechanical and Aerospace Engineering and Engineering Mechanics
University of Missouri-Rolla
Rolla, MO 65409 USA

Ronald G. Harley
School of Electrical and Computer Engineering
Georgia Institute of Technology
Atlanta, GA 30332 USA

Warren E. Hearnes II
Intelligent Systems and Controls Laboratory
Georgia Institute of Technology
Atlanta, GA 30332 USA

Douglas C. Hittle
Department of Mechanical Engineering
Colorado State University
Fort Collins, CO 80523 USA

Dong-Oh Kang
Department of Electrical Engineering and Computer Science
Korea Advanced Institute of Science and Technology
Yuseong-gu, Daejeon 305-701 Republic of Korea

Matthew Kretchmar
Department of Computer Science
Colorado State University
Fort Collins, CO 80523 USA

George G. Lendaris
Systems Science and Electrical Engineering
Portland State University
Portland, OR 97207 USA

Derong Liu
Department of Electrical and Computer Engineering
University of Illinois at Chicago
Chicago, IL 60612 USA

Sridhar Mahadevan
Department of Computer Science
University of Massachusetts
Amherst, MA 01003 USA

James A. Momoh
Center for Energy Systems and Control
Department of Electrical Engineering
Howard University
Washington, DC 20059 USA

Angelia Nedich
Alphatech, Inc.
Burlington, MA 01803 USA

James C. Neidhoefer
Accurate Automation Corporation
Chattanooga, TN 37421 USA

Arnab Nilim
Department of Electrical Engineering and Computer Sciences
University of California at Berkeley
Berkeley, CA 94720 USA

Warren B. Powell
Department of Operations Research and Financial Engineering
Princeton University
Princeton, NJ 08544 USA

Danil V. Prokhorov
Research and Advanced Engineering
Ford Motor Company
Dearborn, MI 48124 USA

Khashayar Rohanimanesh
Department of Computer Science
University of Massachusetts
Amherst, MA 01003 USA

Michael T. Rosenstein
Department of Computer Science
University of Massachusetts