Table Of ContentREINFORCEMENT AND
SYSTEMIC MACHINE LEARNING
FOR DECISION MAKING
www.it-ebooks.info
fmatter_fmatter.qxd 6/19/2012 6:40 PM Page ii
IEEE Press
445 Hoes Lane
Piscataway, NJ 08855
IEEE Press Editorial Board
John B. Anderson, Editor in Chief
R. Abhari G. W. Arnold F. Canavero
D. Goldof B-M. Haemmerli D. Jacobson
M. Lanzerotti O. P. Malik S. Nahavandi
T. Samad G. Zobrist
Kenneth Moore, Director of IEEE Book and Information Services (BIS)
www.it-ebooks.info
Reinforcement and
Systemic Machine Learning
for Decision Making
Parag Kulkarni
www.it-ebooks.info
Copyright(cid:1)2012bytheInstituteofElectricalandElectronicsEngineers,Inc.
PublishedbyJohnWiley&Sons,Inc.,Hoboken,NewJersey.Allrightsreserved.
PublishedsimultaneouslyinCanada
Nopartofthispublicationmaybereproduced,storedinaretrievalsystem,ortransmittedinanyform
orbyanymeans,electronic,mechanical,photocopying,recording,scanning,orotherwise,exceptas
permittedunderSection107or108ofthe1976UnitedStatesCopyrightAct,withouteitherthepriorwritten
permissionofthePublisher,orauthorizationthroughpaymentoftheappropriateper-copyfeetothe
CopyrightClearanceCenter,Inc.,222RosewoodDrive,Danvers,MA01923,(978)750-8400,
fax(978)750-4470,oronthewebatwww.copyright.com.RequeststothePublisherforpermission
shouldbeaddressedtothePermissionsDepartment,JohnWiley&Sons,Inc.,111RiverStreet,Hoboken,
NJ07030,(201)748-6011,fax(201)748-6008,oronlineathttp://www.wiley.com/go/permission.
LimitofLiability/DisclaimerofWarranty:Whilethepublisherandauthorhaveusedtheirbestefforts
inpreparingthisbook,theymakenorepresentationsorwarrantieswithrespecttotheaccuracyor
completenessofthecontentsofthisbookandspecificallydisclaimanyimpliedwarrantiesof
merchantabilityorfitnessforaparticularpurpose.Nowarrantymaybecreatedorextendedbysales
representativesorwrittensalesmaterials.Theadviceandstrategiescontainedhereinmaynotbesuitable
foryoursituation.Youshouldconsultwithaprofessionalwhereappropriate.Neitherthepublishernor
authorshallbeliableforanylossofprofitoranyothercommercialdamages,includingbutnot
limitedtospecial,incidental,consequential,orotherdamages.
Forgeneralinformationonourotherproductsandservicesorfortechnicalsupport,please
contactourCustomerCareDepartmentwithintheUnitedStatesat(800)762-2974,outsidethe
UnitedStatesat(317)572-3993orfax(317)572-4002.
Wileyalsopublishesitsbooksinavarietyofelectronicformats.Somecontentthatappearsinprint
maynotbeavailableinelectronicformats.FormoreinformationaboutWileyproducts,visitour
websiteatwww.wiley.com.
LibraryofCongressCataloging-in-PublicationData:
Kulkarni,Parag.
Reinforcementandsystemicmachinelearningfordecisionmaking/Parag
Kulkarni.
p.cm.–(IEEEseriesonsystemsscienceandengineering;1)
ISBN978-0-470-91999-6
1.Reinforcementlearning. 2.Machinelearning. 3.DecisionMaking.I.
Title.
Q325.6.K852012
006.301–dc23
2011043300
PrintedintheUnitedStatesofAmerica
10 9 8 7 6 5 4 3 2 1
www.it-ebooks.info
Dedicated tothe late D.B. Joshi and thelate SavitriJoshi,
who inspired metothinkdifferently
www.it-ebooks.info
CONTENTS
Preface xv
Acknowledgments xix
About the Author xxi
1 Introduction toReinforcement and Systemic Machine Learning 1
1.1. Introduction 1
1.2. Supervised, Unsupervised, andSemisupervised Machine
Learning 2
1.3. Traditional Learning Methods andHistoryof Machine
Learning 4
1.4. What IsMachine Learning? 7
1.5. Machine-LearningProblem 8
1.5.1. Goals of Learning 8
1.6. Learning Paradigms 9
1.7. Machine-LearningTechniques andParadigms 12
1.8. What IsReinforcement Learning? 14
1.9. Reinforcement Function and Environment Function 16
1.10. Need ofReinforcementLearning 17
1.11. Reinforcement Learningand Machine Intelligence 17
1.12. What IsSystemicLearning? 18
1.13. What IsSystemicMachine Learning? 18
1.14. Challengesin Systemic Machine Learning 19
1.15. Reinforcement Machine Learningand Systemic
Machine Learning 19
1.16. Case Study Problem Detection ina Vehicle 20
1.17. Summary 20
Reference 21
2 Fundamentals of Whole-System, Systemic, and Multiperspective
Machine Learning 23
2.1. Introduction 23
2.1.1. What IsSystemicLearning? 24
2.1.2. History 26
vii
www.it-ebooks.info
viii CONTENTS
2.2. What IsSystemicMachine Learning? 27
2.2.1. Event-Based Learning 29
2.3. Generalized SystemicMachine-Learning Framework 30
2.3.1. System Definition 31
2.4. MultiperspectiveDecision Making andMultiperspective
Learning 33
2.4.1. Representation Based on Complete Information 40
2.4.2. Representation Based on PartialInformation 41
2.4.3. Uni-PerspectiveDecisionScenarioDiagram 41
2.4.4. Dual-PerspectiveDecisionScenarioDiagrams 41
2.4.5. MultiperspectiveRepresentativeDecision Scenario
Diagrams 42
2.4.6. QualitativeBelief Networkand ID 42
2.5. Dynamic and InteractiveDecisionMaking 43
2.5.1. InteractiveDecision Diagrams 43
2.5.2. Role of TimeinDecisionDiagrams and Influence
Diagrams 43
2.5.3. Systemic ViewBuilding 44
2.5.4. Integration ofInformation 45
2.5.5. Building RepresentativeDSD 45
2.5.6. Limited Information 45
2.5.7. Role of MultiagentSystem inSystemic Learning 46
2.6. The Systemic Learning Framework 47
2.6.1. Mathematical Model 50
2.6.2. Methods for Systemic Learning 50
2.6.3. AdaptiveSystemic Learning 51
2.6.4. Systemic LearningFramework 52
2.7. System Analysis 52
2.8. Case Study: Need ofSystemicLearninginthe Hospitality
Industry 54
2.9. Summary 55
References 56
3 Reinforcement Learning 57
3.1. Introduction 57
3.2. LearningAgents 60
3.3. Returns and Reward Calculations 62
3.3.1. Episodic andContinuing Task 63
3.4. Reinforcement Learning and AdaptiveControl 63
3.5. Dynamic Systems 66
3.5.1. DiscreteEventDynamic System 67
3.6. Reinforcement Learning and Control 68
www.it-ebooks.info
CONTENTS ix
3.7. MarkovPropertyand MarkovDecisionProcess 68
3.8. Value Functions 69
3.8.1. Action and Value 70
3.9. Learning anOptimal Policy(Model-Based and
Model-FreeMethods) 70
3.10. Dynamic Programming 71
3.10.1. Properties ofDynamicSystems 71
3.11. AdaptiveDynamicProgramming 71
3.11.1. Temporal Difference (TD) Learning 71
3.11.2. Q-Learning 74
3.11.3. UnifiedView 74
3.12. Example:Reinforcement Learning for
Boxing Trainer 75
3.13. Summary 75
Reference 76
4 SystemicMachineLearning andModel 77
4.1. Introduction 77
4.2. AFramework for Systemic Learning 78
4.2.1. Impact Space 80
4.2.2. Interaction-Centric Models 85
4.2.3. Outcome-Centric Models 85
4.3. Capturing the Systemic View 86
4.4. Mathematical Representation of System Interactions 89
4.5. Impact Function 91
4.6. Decision-Impact Analysis 91
4.6.1. Time and Space Boundaries 92
4.7. Summary 97
5 Inference and Information Integration 99
5.1. Introduction 99
5.2. InferenceMechanisms and Need 101
5.2.1. ContextInference 103
5.2.2. InferencetoDetermine Impact 103
5.3. Integrationof Contextand Inference 107
5.4. Statistical Inference andInduction 111
5.4.1. Direct Inference 111
5.4.2. Indirect Inference 112
5.4.3. InformativeInference 112
5.4.4. Induction 112
5.5. Pure Likelihood Approach 112
www.it-ebooks.info
x CONTENTS
5.6. Bayesian Paradigm andInference 113
5.6.1. Bayes’ Theorem 113
5.7. Time-Based Inference 114
5.8. InferencetoBuild aSystemView 114
5.8.1. Information Integration 115
5.9. Summary 118
References 118
6 Adaptive Learning 119
6.1. Introduction 119
6.2. AdaptiveLearning and AdaptiveSystems 119
6.3. What IsAdaptiveMachine Learning? 123
6.4. Adaptation andLearningMethodSelection Based
on Scenario 124
6.4.1. Dynamic Adaptation andContext-AwareLearning 125
6.5. Systemic Learningand AdaptiveLearning 127
6.5.1. Use ofMultiple Learners 129
6.5.2. Systemic AdaptiveMachine Learning 132
6.5.3. Designingan AdaptiveApplication 135
6.5.4. Need ofAdaptiveLearningand Reasons
for Adaptation 135
6.5.5. Adaptation Types 136
6.5.6. Adaptation Framework 139
6.6. CompetitiveLearningand AdaptiveLearning 140
6.6.1. Adaptation Function 142
6.6.2. DecisionNetwork 144
6.6.3. Representation ofAdaptiveLearning Scenario 145
6.7. Examples 146
6.7.1. Case Study: Text-Based AdaptiveLearning 147
6.7.2. AdaptiveLearning for DocumentMining 148
6.8. Summary 149
References 149
7 Multiperspectiveand Whole-System Learning 151
7.1. Introduction 151
7.2. MultiperspectiveContextBuilding 152
7.3. MultiperspectiveDecision Making andMultiperspective
Learning 154
7.3.1. Combining Perspectives 155
7.3.2. InfluenceDiagram and Partial Decision Scenario
Representation Diagram 156
www.it-ebooks.info
CONTENTS xi
7.3.3. RepresentativeDecision Scenario Diagram (RDSD) 160
7.3.4. Example:PDSRD Representations for City
InformationCapturedfrom Different Perspectives 160
7.4. Whole-System Learning and MultiperspectiveApproaches 164
7.4.1. IntegratingFragmented Information 165
7.4.2. MultiperspectiveandWhole-System Knowledge
Representation 165
7.4.3. What Are MultiperspectiveScenarios? 165
7.4.4. ContextinParticular 166
7.5. Case Study Based onMultiperspectiveApproach 167
7.5.1. Traffic Controller Based onMultiperspective
Approach 167
7.5.2. MultiperspectiveApproachModel for Emotion
Detection 169
7.6. Limitations toaMultiperspectiveApproach 174
7.7. Summary 174
References 175
8 IncrementalLearning and Knowledge Representation 177
8.1. Introduction 177
8.2. WhyIncremental Learning? 178
8.3. Learning fromWhat Is Already Learned.. . 180
8.3.1. Absolute Incremental Learning 181
8.3.2. SelectiveIncremental Learning 182
8.4. Supervised Incremental Learning 191
8.5. Incremental Unsupervised Learningand Incremental
Clustering 191
8.5.1. Incremental Clustering: Tasks 193
8.5.2. Incremental Clustering: Methods 195
8.5.3. Threshold Value 196
8.6. Semisupervised Incremental Learning 196
8.7. Incremental and Systemic Learning 199
8.8. Incremental Closeness Value and Learning Method 200
8.8.1. Approach 1for Incremental Learning 201
8.8.2. Approach 2 202
8.8.3. Calculating C Values Incrementally 202
8.9. Learning andDecision-MakingModel 205
8.10. Incremental Classification Techniques 206
8.11. Case Study: Incremental DocumentClassification 207
8.12. Summary 208
www.it-ebooks.info