University of Huddersfield Repository

Jilani, Rabia

Learning Static Knowledge for AI Planning Domain Models via Plan Traces

Original Citation

Jilani, Rabia (2017) Learning Static Knowledge for AI Planning Domain Models via Plan Traces. Doctoral thesis, University of Huddersfield.

This version is available at http://eprints.hud.ac.uk/id/eprint/34414/

The University Repository is a digital collection of the research output of the University, available on Open Access. Copyright and Moral Rights for the items on this site are retained by the individual author and/or other copyright owners. Users may access full items free of charge; copies of full text items generally can be reproduced, displayed or performed and given to third parties in any format or medium for personal research or study, educational or not-for-profit purposes without prior permission or charge, provided:

• The authors, title and full bibliographic details are credited in any copy;
• A hyperlink and/or URL is included for the original metadata page; and
• The content is not changed in any way.

For more information, including our policy and submission procedure, please contact the Repository Team at: [email protected].

http://eprints.hud.ac.uk/

LEARNING STATIC KNOWLEDGE FOR AI PLANNING DOMAIN MODELS VIA PLAN TRACES

RABIA JILANI

A thesis submitted to the University of Huddersfield in partial fulfilment of the requirements for the degree of Doctor of Philosophy

The University of Huddersfield

August 2017

COPYRIGHT STATEMENT

i. The author of this thesis (including any appendices and/or schedules to this thesis) owns any copyright in it (the “Copyright”) and s/he has given The University of Huddersfield the right to use such copyright for any administrative, promotional, educational and/or teaching purposes.

ii. Copies of this thesis, either in full or in extracts, may be made only in accordance with the regulations of the University Library. Details of these regulations may be obtained from the Librarian.
This page must form part of any such copies made.

iii. The ownership of any patents, designs, trademarks and any and all other intellectual property rights except for the Copyright (the “Intellectual Property Rights”) and any reproductions of copyright works, for example graphs and tables (“Reproductions”), which may be described in this thesis, may not be owned by the author and may be owned by third parties. Such Intellectual Property Rights and Reproductions cannot and must not be made available for use without the prior written permission of the owner(s) of the relevant Intellectual Property Rights and/or Reproductions.

ABSTRACT

Learning is fundamental to autonomous behaviour; from the point of view of Machine Learning, it is the ability of computers to learn without being explicitly programmed. The aim of attaining this capability for learning domain models for Automated Planning (AP) engines is what triggered research into automated domain-learning systems. These systems can learn from training data. Until recently, it was believed that action models could not be constructed a priori for dynamically changing and unpredictable environments. Over the last decade, many systems have proved effective in engineering domain models by learning from plan traces. However, these systems require additional planner-oriented information such as a partial domain model and initial, goal and/or intermediate states. Hence the question arises of whether we can learn a dynamic domain model that covers all domain behaviours from real-time action sequence traces alone.

The research in this thesis extends one of the most promising lines of work in this area, which is connected to work presented in an REF journal paper. This research aims to enhance the LOCM system and to extend the method of Learning Domain Models for AI Planning Engines via Plan Traces.
This method was first published in ICAPS 2009 by Cresswell, McCluskey, and West (Cresswell, 2009). LOCM is unique in that it requires no prior knowledge of the target domain, yet it can produce the dynamic part of a domain model from training data. Its main drawback is that it does not produce static knowledge of the domain, and its model lacks certain expressive features. A key aspect of the research presented in this thesis is to enhance the technique with the capacity to generate static knowledge. The focus of this PhD is to enable LOCM to learn static relationships fully automatically, in addition to the dynamic relationships it can already learn, using plan traces as input.

We present a novel system, ASCoL (Automatic Static Constraints Learner), which provides a graphical interface for visual representation and exploits directed-graph discovery and analysis techniques. It has been designed to discover domain-specific static relations/constraints automatically in order to enhance planning domain models. The ASCoL method has wider applications. Combined with LOCM, ASCoL can be a useful tool for producing benchmark domains for automated planning engines. It is also useful as a debugging tool for improving existing domain models. We have evaluated ASCoL on fifteen different IPC domains, using different types of goal-oriented and random-walk plans as input training data, and it has been shown to be effective.

TABLE OF CONTENTS

Chapter 1 - Introduction
  Overview
  Difference between Autonomy and Automaticity
  Rationale of the Research
  Motivation
  Novel Contributions of Thesis
  Thesis Structure and Highlights
  Summary
Chapter 2 - Background and Literature Review
  2.1 Theoretical Underpinning
    I - Automated Planning (AP)
    Assumptions in Automated Planning
    Classical planning
    Different classes of Planning
    Planning problem
    Planners
    STRIPS and other Planning Techniques
    II - Knowledge Engineering (KE)
    Bottleneck of KBS: Knowledge Acquisition (KA)
    Knowledge Acquisition in literature
    Knowledge Representation (KR) for Planning and Scheduling
    Knowledge Engineering for Planning and Scheduling (KEPS)
    III - Graph Theory
    Graphs in Knowledge Engineering for Automated Planning
  2.2 The Scope of this Research
    Objective and Learning Problem
  2.3 Learning from Plan Traces
  2.4 Illustration of the Controlled Search Problem
    Example: TPP Domain
  2.5 The Problem Domains
  2.6 Related Work
    Specialised Related Work
  Summary
Chapter 3 - Learning in Autonomous Systems
  3.1 Approaches for Autonomous Learning
    3.1.1 Policy Learning
    3.1.2 Environmental Modelling
    3.1.3 Planning Domain Model Learning
    3.1.4 Specialised Knowledge Acquisition
  3.2 LOCM Family of Algorithms
    3.2.1 LOCM
    3.2.2 LOCM2
    3.3.3 Experimental Work with LOCM
  Summary
Chapter 4 - KE Tools and Comparative Analysis
  4.1 KE Tools for Comparative Analysis
    4.1.1 Opmaker
    4.1.2 SLAF
    4.1.3 ARMS
    4.1.4 Opmaker2
    4.1.5 LSO-NIO
    4.1.6 RIM
    4.1.7 AMAN
  4.2 Criteria for Evaluating Tools
    Input Requirements
    Provided Output
    Language
    Noise in Plans
    Refinement
    Operational Efficiency
    User Experience
    Availability and Usage
  4.3 Tools Evaluation
    Inputs Requirements
    Provided Output
    Language
    Noise in Plans
    Refinement
    Operational Efficiency
    User Experience
    Availability and Usage
  4.4 Recommendations and Reviews
Chapter 5 - ASCoL
  5.1 Introduction
    5.1.1 Preliminaries
    5.1.2 Assumptions of ASCoL
  5.2 ASCoL Algorithm
    5.2.1 Step 1: Generation of Vertices Pairs
    5.2.2 Step 2: Generation of Digraphs
    5.2.3 Step 3: Analysis of the Directed Graphs
    5.2.4 Conversion to PDDL
    5.2.5 Discussion
  5.3 Implementation
    5.3.1 System Design & Development
    5.3.2 ASCoL Application Architecture
    5.3.3 Testing
  5.4 Argument for Extracting Same-Typed Static Relations
    5.4.1 Freecell Domain
    5.4.2 Logistics Domain
    5.4.3 Miconic Domain
    5.4.4 Conclusion
  Summary of the Chapter
Chapter 6 - Evaluation
  6.1 Experimental Setup
  6.2 Types of Static Facts
  6.3 Evaluation Metrics
    6.3.1 Accuracy
    6.3.2 Precision
    6.3.3 Statistical Binary Classification
  6.4 Interesting/Peculiar Models
    6.4.1 TPP Domain
    6.4.2 Zenotravel Domain
    6.4.3 Mprime Domain
  6.5 Learning Static Relations Using ASCoL
  6.6 Significant Experimental Results
    6.6.1 Freecell Domain
    6.6.2 TPP Domain
    6.6.3 Miconic Domain
    6.6.4 Gold-Miner Domain
    6.6.5 PegSolitaire Domain
    6.6.6 Mprime Domain
  6.7 Impact of Differently-Generated Plans
    6.7.1 LOCM
    6.7.2 ASCoL
    6.7.3 Discussion
  Summary
Chapter 7 - Extended Uses of ASCoL
  7.1 Analysis of Domain Model using Static Graphs
    7.1.1 Extended Static Relations (ESRs)
    7.1.2 Shift Operators or Static Modifier (OSM)
    7.1.3 Conclusion
  7.2 Benchmarking Planning Domains – ASCoL + LOCM
    7.2.1 Assumptions
    7.2.2 Complexity of Input
    7.2.3 Complexity of Card Games Modelling
    7.2.4 Performance of Automatic Models Generation
    7.2.5 Performance of State-of-the-Art Planners
    7.2.6 Lessons Learnt
  Summary
Chapter 8 - Conclusion & Future Work
  8.1 Thesis Summary
    8.1.1 Requirements and Restrictions of the System
    8.1.2 Summary of Chapters
    8.1.3 Potential Application areas of this research
  8.2 Future Work
Bibliography
Appendices
  Appendix A
    Benchmark: Freecell Domain
    LOCM: Freecell Domain
    LOCM: Freecell Problem Instance
  Appendix B
    B-1. Result of type Fuel in Donate operator of Mprime Domain
    B-2. Result of Unload, Load and Buy operators in TPP Domain

LIST OF FIGURES

Figure 1.1: Autonomic Process
Figure 1.2: Autonomic Architecture (McCluskey 2015)
Figure 2.1: Logical Separation between Planning Engine and Domain Model
Figure 2.2: The Blocks Domain
Figure 2.3: Blocks Problem
Figure 2.4: Typical Blocks World problem
Figure 2.5: Planning as an independent component
Figure 2.6: Typical STRIPS Operator
Figure 2.7: Old idea of KBS development
Figure 2.8: An Idealised Planning KE Environment (Biundo, Aylett et al. 2003)
Figure 2.9: An FDNA Graph is a Topology of Receiver-Feeder Nodes.
Figure 2.10: Planning domain design processes in itSIMPLE2.0
Figure 2.11: Declaration of language (Vodrázka and Chrpa 2010)
Figure 2.12: totally ordered plan from Blocks Domain
Figure 2.13: Input Output Structure of ASCoL
Figure 2.14: Planning as a Tree Search
Figure 2.15: Graph (non-hierarchical) converted to Tree (hierarchical)
Figure 2.16: Control Searched Planning using Constraints in Preconditions
Figure 2.17: Microsoft Windows Freecell Game
Figure 2.18: Type Hierarchy in Freecell Domain
Figure 2.19: homefromfreecell - Freecell domain (Left) and LOCM (Right)
Figure 2.20: Induced FSMs for card, num and suit in action homefromfreecell
Figure 3.1: Autonomy and Processing required by Classes of Learning Sources
Figure 3.2: IPC Ferry Domain Graph
Figure 3.3: LOCM Induced Ferry Domain Graph
Figure 3.4: Four-operator Blocks Domain Graph
Figure 3.5: Four-operator Blocks Domain Graph by LOCM
Figure 4.1: A screenshot of Opmaker
Figure 4.2: Comparison between RIM and ARMS
Figure 4.3: General Architecture of Domain Learning System
Figure 5.1: ASCoL Method Overview
Figure 5.2: Input - A training sequence of length 12 (Freecell)
Figure 5.3: Output - Action Set containing all actions that satisfy Assumption 3 (Freecell)
Figure 5.4: Bigraph for generating pairs of arguments from operator definition
Figure 5.5: A directed graph with a linear structure (Pair 1: type num)