
TREAT: A New and Efficient Match Algorithm for Artificial Intelligence Production Systems

144 Pages·1988·8.887 MB·English


Daniel P. Miranker
Department of Computer Sciences
The University of Texas at Austin

TREAT: A New and Efficient Match Algorithm for AI Production Systems

Pitman, London
Morgan Kaufmann Publishers, Inc., San Mateo, California

PITMAN PUBLISHING
128 Long Acre, London WC2E 9AN
A Division of Longman Group UK Limited

© Daniel P. Miranker 1990
First published 1990

Available in the Western Hemisphere from MORGAN KAUFMANN PUBLISHERS, INC., 2929 Campus Drive, San Mateo, California 94403

ISSN 0268-7526

British Library Cataloguing in Publication Data
Miranker, Daniel P.
TREAT: a new and efficient match algorithm for AI production systems. -- (Research notes in artificial intelligence, ISSN 0268-7526).
1. Expert systems. Design. Algorithms. Treat
I. Title II. Series
658.5'3
ISBN 0-273-08793-2

Library of Congress Cataloging in Publication Data
Miranker, Daniel.
Treat: a new and efficient match algorithm for AI production systems / Daniel Miranker.
p. cm. -- (Research notes in artificial intelligence)
Bibliography: p.
ISBN 0-934613-71-0 (Morgan Kaufmann)
1. Expert systems (computer science) 2. Algorithms. 3. Artificial intelligence. 4. Parallel processing (electronic computers)
I. Title. II. Series.
QA76.76.E95M57 1990
006.3--DC19

All rights reserved; no part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise without either the prior written permission of the publishers or a licence permitting restricted copying issued by the Copyright Licensing Agency Ltd, 33-34 Alfred Place, London WC1E 7DP. This book may not be lent, resold, hired out or otherwise disposed of by way of trade in any form of binding or cover other than that in which it is published, without the prior consent of the publishers.
Reproduced and printed by photolithography in Great Britain by Biddles Ltd, Guildford

List of Figures

Figure 2-1 An Example Production Rule
Figure 2-2 An Example Production Rule with a Negated Condition Element
Figure 2-3 The Generalized Structure of a Shared Memory Machine
Figure 3-1 Production System Rule as Database Query
Figure 3-2 Correspondence of Production Systems to Database Systems
Figure 3-3 Black Box View of a Production System Algorithm
Figure 3-4 RETE Match Network for the Rule in Figure 2-1
Figure 3-5 Initial State of the RETE Network
Figure 3-6 Activity of the RETE Match During an Addition
Figure 3-7 Activity of the RETE Match During a Deletion
Figure 4-1 Redundant Storage of State in Different Beta-Memories
Figure 4-2 Rule System Displaying Negated Condition Element Problem
Figure 4-3 Abstract Algorithm Illustrating TREAT
Figure 4-4 Initial State as Stored by TREAT
Figure 4-5 Activity of TREAT During an Addition
Figure 4-6 Activity of TREAT During a Deletion
Figure 4-7 Counting the Comparisons for TREAT
Figure 4-8 Sizing the Beta-memories
Figure 4-9 Measurements of the Number and Type of Conditions per Rule
Figure 4-10 Summary of the Gross Characteristics of the Studied Systems
Figure 4-11 The Number of Comparisons Required to Compute Variable Bindings for Each OPS5 Implementation
Figure 5-1 Functional Division of the DADO Tree
Figure 5-2 Hyper-H Embedding of a Binary Tree
Figure 5-3 The Leiserson Chip Design
Figure 5-4 Two Leiserson Chips Hooked Together
Figure 5-5 The DADO1 Prototype Processing Element
Figure 5-6 The DADO2 Prototype Processing Element
Figure 5-7 Queuing Model of DADO1a (no buffering)
Figure 5-8 Queuing Model of DADO1b (buffering)
Figure 5-9 Queuing Model of DADO2a (no buffering)
Figure 5-10 Queuing Model of DADO2b (buffering)
Figure 5-11 Statistics Characterizing the DADO Instruction Stream
Figure 5-12 Passive Queue Construct
Figure 5-13 DADO SIMD Instruction-Stream Generator Model
Figure 5-14 RESQ Queuing Model of a DADO1 PE
Figure 5-15 RESQ Queuing Model of a DADO2 PE
Figure 5-16 Throughput Results for the Four DADO Models
Figure 5-17 Throughput vs. Length of Data Dependent Operations
Figure 5-18 Performance of Deeper DADO Trees
Figure 5-19 Software Layering for the DADO Architecture
Figure 5-20 Illustration of Tree Neighbor Communication
Figure 5-21 Sequentially Loading DADO
Figure 5-22 Associative Probing
Figure 5-23 Locus of Control Among PEs
Figure 6-1 NETL Hardware Organization
Figure 6-2 Organization of a Connection Machine Processing Element
Figure 6-3 Alpha Operator Applied to Xappings
Figure 6-4 The Beta Operator
Figure 7-1 Gupta's Production System Machine
Figure 7-2 Expected Speed From Gupta's Production System Machine
Figure 7-3 Proportion of Time Spent in Each Phase of the Production System Cycle, Using TREAT-OPS5 and Full Distribution Partitioning
Figure 7-4 Speed-Up of Partial Match
Figure 7-5 Size per Cycle of the Intersection of Affect Set and Active Set
Figure 7-6 Speed-up by Partition
Figure 7-7 Speed-up by PEs per Partition
Figure 7-8 Waltz Speed-Ups By Number of PEs
Figure 7-9 The Performance, Actual and Predicted, of Three OPS5 Implementations

Acknowledgements

I would like to thank Sal Stolfo, my thesis advisor, for thinking of the DADO machine in the first place and for guiding my work, also for demonstrating the value of honesty and integrity, and not least, for having enough faith and trust in me to give me enough rope to hang myself. Kudos to Rick Reed for his great contributions toward the implementation of OPS5 on the DADO machine and, in the process, for painstakingly uncovering the bugs in everyone else's work, including my own. The DADO project is a large project: Andrew Comas, Eugene Dong, Zdenek Radouch and Jody Weiss were responsible for the construction of the DADO2.
My fellow graduate students on the project, Mark Lerner, Andy Lowrie, Russell Mills, Al Pasik, Steve Taylor and Michael van Biema, and staff members Rick Reed, Lee Woodbury and Philip Yuen, as well as my officemate Terry Boult, all contributed to many lively discussions about the DADO project. Many of the ideas presented in this thesis arose from these discussions. I am grateful for the editorial help provided by Pandora Setian, Russell Mills and especially my sister-in-law Cathy Miranker; with their help the document, as well as my own ability to write, has been greatly improved. Graphs of many of the measurements were done with help from Matthew Kallis. Last, I am most thankful for the loving friendship and moral support provided by my wife Valin.

Abstract

Due to the dramatic increase in computing power and the concomitant decrease in computing cost that has occurred over the last decade, many researchers are attempting to design computing systems to solve complicated problems or execute tasks that have in the past been performed by human experts. The focus of Knowledge Engineering is the construction of such complex "expert system" programs. This book will describe the architecture and the software systems embodying the DADO machine, a parallel tree-structured computer designed to provide significant performance improvements over serial computers of comparable hardware complexity in the execution of large expert systems implemented in production system form. The central contribution of this book is a new match algorithm for executing production systems, TREAT, that will be presented and comparatively analyzed with the RETE match algorithm. TREAT, originally designed specifically for the DADO machine architecture, can efficiently handle both temporally redundant and nontemporally redundant production system programs.
Although the development of the algorithm was motivated by the inadequacies of the parallel versions of existing production system algorithms, it is shown that the TREAT algorithm performs better than the best known sequential algorithm, the RETE match, even on a sequential machine.

In memory of my mother, Phyllis Miranker

1 Introduction

1.1 The Problem

Since its inception, the ultimate goal in the field of artificial intelligence (AI) has been to create a machine capable of learning and general problem solving. This goal has proven to be difficult and elusive. Early successes in AI resulted from considering a restricted set of problems in restricted domains; for example, question answering in a blocks world and solving analogy questions on IQ tests [55]. These successes led researchers to consider "real world" domains that ordinarily are in the province of trained human experts, but are still narrow in scope. In the last decade researchers have succeeded in creating expert programs (or expert systems) that are capable of performing medical diagnosis [76], discovering mineral deposits [11] and analyzing electronic circuits, to name a few. The heart of these systems is a knowledge base, containing a large collection of facts, definitions, procedures and heuristic "rules of thumb" acquired from a human expert.

More recently researchers have formalized the techniques involved in the development of expert systems and have implemented computer tools and specialized languages [20, 9] that facilitate the creation of expert systems. ACE, a system that recommends preventive maintenance on telephone cables [96], and R1/XCON, a system that configures VAX computers [49], are two systems, both in commercial use, written directly in an expert system language. Textbooks have been written so that knowledge engineering may be taught at the undergraduate level [97, 8].
Knowledge engineers are the intermediaries between the expert and the system; they extract, formalize, represent, and test the relevant knowledge within a computer program.

In consequence, expert systems are becoming increasingly important in the commercial environment. Just as more conventional computer technologies offer the potential for higher productivity in the blue-collar work force, it appears that AI expert systems will offer the same productivity increase in the white-collar work force. Articles on AI and expert systems continually published in the business pages of newspapers and magazines [54, 3] substantiate this. Two independent market analyses estimate that AI-related business will grow to nearly $10 billion a year by 1990 [4, 70].

Although the dramatic increase in computer power, available at ever decreasing prices, has made the development of expert systems possible, these systems continue to tax the resources of even large general-purpose computer systems. The benchmark runs of even the smallest student expert-system projects, reported in later chapters of this book, require almost a CPU minute on an IBM 4381 mainframe computer. Large-scale expert systems require tens or hundreds of times that amount. The lengthy response times for these systems certainly frustrate system developers and perhaps impede the development of expert-system programs.

This book presents a two-fold attack on improving the speed of expert systems based on the production system paradigm. The primary motivation behind the entire work is to investigate the applicability of parallel computation to the production system problem. In that vein, the specification, performance analysis and implementation of the DADO machine, a massively parallel tree-structured computer, as well as its layered programming model, are part of this book. Two DADO prototypes have been built. The DADO1, a 15-processor machine, has been operational since the spring of 1983.
The second, larger prototype, DADO2, contains 1,023 8-bit processors and has an aggregate capacity of 570 MIPS and 16 megabytes of RAM storage. DADO2 has been operational since the fall of 1985. The DADO2 is reliable. Smaller copies of DADO2, with 31 to 63 processors, are presently being used outside of Columbia University by researchers at AT&T and the Fifth Generation Computer Corp.

The DADO serves as an environment that structures the development of parallel algorithms for production systems. Since its development over 10 years ago, the RETE match algorithm [20] has been assumed to be the best algorithm for the execution of production systems. This assumption was questioned in a conjecture made by McDermott et al. [52]. An early study of DADO [24] determined that parallel versions of the RETE match were inappropriate as algorithms for the DADO machine. Motivated by the constraints imposed by the DADO model, a new production system matching algorithm called TREAT is presented and analyzed. A serendipitous and exciting result is that even in a sequential environment, a comparative empirical analysis of TREAT and RETE shows that TREAT outperforms the RETE match, often by more than two to one. This analysis substantiates McDermott's conjecture and forms the basic contribution of this thesis.

1.2 Outline of the Book

The preliminary chapter of the book contains background material. Production systems are defined and the OPS5 production system language syntax is described. Since the motivation of the work is the introduction of parallel execution of expert systems, a taxonomy of parallel computers is also presented. The next chapter, chapter three, details the algorithmic considerations of production system interpreters, particularly as they relate to parallelism and also how they relate to relational database systems.
Chapter three suggests that production system interpreters may be decomposed into three nearly independent algorithmic aspects: low-level matching, partitioning and synchronization. The problem of optimizing the execution of production systems has garnered a great deal of attention in the community. The three aspects are explained in detail and a synopsis of work related to each aspect is presented. The algorithmic aspect central to the work is the low-level matching issue. Since its development in 1974, the RETE match [21] has been commonly assumed to be the best algorithm to be used to implement production system interpreters. Chapter three includes a detailed description of the RETE match.

Footnote 1: DADO is a name, not an acronym.

Chapter four presents the TREAT algorithm. Specific mention is made of how the strong points of the RETE match in a sequential environment become troublesome in a parallel environment. These observations, coupled with ideas related to database query optimization, form the basis of the TREAT match algorithm. An interpreter for the OPS5 production system language was implemented using TREAT. Two variations of the TREAT algorithm that incorporate optimization techniques adapted from relational databases were also implemented. Chapter four concludes with the presentation of the comparative performance of the three versions of TREAT and the widely released implementation of OPS5 that is based on the RETE match.

A developer of an expert system could improve the performance of his system by buying a larger, more expensive conventional computer. Another approach is to augment an inexpensive workstation with a special-purpose coprocessor able to accelerate his application. The DADO machine and its system architecture are intended to be such a coprocessor. Chapter five, presenting the DADO system architecture, has three major components.
First is a description of how the DADO architecture is capable of improving production system performance by spawning large numbers of independent matching processes among its many processors. Second, the execution model of the DADO machine, particularly as it pertains to communication, is quite unusual. The second part of chapter five includes a performance analysis of four alternative communication schemes. The data to drive this analysis were derived from DADO1 and used to optimally design the DADO2 machine. The third part of chapter five is a detailed explanation of the layered programming environment for DADO and in particular the parallel systems programming language for the machine.

Chapter six describes other parallel AI efforts. Some criticism of each of the machine proposals is included, although a detailed analysis of these other machines is beyond the scope of this book.

Chapter seven presents an outline of the implementation of TREAT on the DADO, related work in the field of parallel implementations of production systems, and empirical results that form a basis to predict the performance of TREAT on the DADO machine.

1.3 Digest of Conclusions

1. The development of the TREAT production system matching algorithm provides the opportunity to double the performance of forward chaining production systems.
2. TREAT is demonstrated to be faster on serial computers than RETE, the best known production system algorithm to date.
3. The DADO architecture contains a unique execution model where parallel processing is performed by independent instruction streams (MIMD processing; see footnote 2), but the synchronization of communication operations make DADO

Footnote 2: MIMD and SIMD are defined in section 2.2.2.
