Machine Intelligence and
Pattern Recognition
Volume 15
Series Editors
L.N. KANAL
and
A. ROSENFELD
University of Maryland
College Park, Maryland, U.S.A.
Parallel Processing
for Artificial Intelligence 2
Edited by
Hiroaki KITANO
Sony Computer Science Laboratory, Japan
and Carnegie Mellon University
Pittsburgh, Pennsylvania, U.S.A.
Vipin KUMAR
University of Minnesota
Minneapolis, Minnesota, U.S.A.
Christian B. SUTTNER
Technical University of Munich
Munich, Germany
NORTH-HOLLAND
AMSTERDAM · LONDON · NEW YORK · TOKYO
ELSEVIER SCIENCE B.V.
Sara Burgerhartstraat 25
P.O. Box 211, 1000 AE Amsterdam, The Netherlands
ISBN: 0 444 81837 5
© 1994 Elsevier Science B.V. All rights reserved.
No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form
or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior written
permission of the publisher, Elsevier Science B.V., Copyright & Permissions Department, P.O. Box
521, 1000 AM Amsterdam, The Netherlands.
Special regulations for readers in the U.S.A. - This publication has been registered with the Copyright
Clearance Center Inc. (CCC), Salem, Massachusetts. Information can be obtained from the CCC about
conditions under which photocopies of parts of this publication may be made in the U.S.A. All other
copyright questions, including photocopying outside of the U.S.A., should be referred to the copyright
owner, Elsevier Science B.V., unless otherwise specified.
No responsibility is assumed by the publisher for any injury and/or damage to persons or property as a
matter of products liability, negligence or otherwise, or from any use or operation of any methods, pro-
ducts, instructions or ideas contained in the material herein.
This book is printed on acid-free paper.
Printed in The Netherlands
PREFACE
This book presents a collection of papers from a broad range of parallel processing
research related to artificial intelligence. It is the second book in a series which
emerged from workshops held in conjunction with the International Working
Conference for Artificial Intelligence (IJCAI).The recently held International Workshop
on Parallel Processing for Artificial Intelligence (PPAI-93) took place at IJCAI-93 in
Chambéry, France. It provided a forum for extensive discussions, and many
participants agreed to contribute papers for this book.
With the increasing availability of parallel machines and the rising interest in
large-scale, real-world applications, research on parallel processing for AI is gaining
importance. Since the mid-1980s, we have seen the emergence of massively parallel computers
as commercial products, not just laboratory prototypes. Nowadays, a number of
parallel machines are available, ranging from MIMD computers with hundreds of
high-performance processors to massively parallel systems with tens of thousands of
processors. Networks of workstations are also gaining importance as virtual
parallel computers, providing high processing capacity at low cost. Besides this,
progress in microprocessor technology enables many researchers to work in
simulated parallel environments, even if their access to real parallel machines is
limited. Finally, the deployment of high-performance communication networks in the
U.S., Japan, and Europe will change the needs for and availability of computing
resources. It will make large parallel computing resources available both by improving
remote access and by offering subnets as virtual machines in their own right. There is no
doubt that software for utilizing the power of the emerging parallel processing
resources is an increasingly important subject. Looking at the status of parallel
processing research for AI, we find that although a number of applications have been
implemented and delivered, the field is still in its infancy. This book is intended to
contribute to the growth of the field by assembling diverse aspects of research in the
area. In this book, we have emphasized diversity rather than coherence. By compiling
diverse research directions, the book provides an overview of the state of the technology.
The contributions have been grouped according to their subject into architectures (3),
languages (4), general algorithms (5), and applications (6). The papers range from
purely theoretical work, simulation studies, and algorithm and architecture proposals to
implemented systems and their experimental evaluation. Beyond presenting a diversity
of possibilities for utilizing parallelism for AI, this book also shows the geographical
diversity in which such work is pursued, featuring contributions from Australia (1),
England (1), France (3), Germany (7), Japan (2), Switzerland (1), and the USA (3).
As the second book in the parallel processing for AI series, this volume provides
continued documentation of the research and advances made in the field. The editors
hope that it will inspire readers to investigate the possibilities for enhancing AI systems
by parallel processing and to make new discoveries on their own!
Hiroaki Kitano
Vipin Kumar
Christian Suttner
Parallel Processing for Artificial Intelligence 2
H. Kitano, V. Kumar and C.B. Suttner (Editors)
© 1994 Elsevier Science B.V. All rights reserved.
Chapter 1
Hybrid Systems on a Multi-Grain Parallel
Architecture
Gérald Ouvradou, Aymeric Poulain Maubant, André Thépaut
Télécom Bretagne, Parallelism and Distributed Systems Lab (PSD),
BP 832, 29285 Brest CEDEX, France
{gerald,aymeric,andre}@enstb.enst-bretagne.fr
Abstract
We first show how systems coupling symbolic methods with sub-symbolic
ones (the so-called hybrid systems) can solve problems efficiently by
exploiting the advantages of both types of methods. We then present a
new flexible multi-grain parallel architecture, called ArMenX, and we
show how hybrid systems can be executed on this machine.
1. Introduction
Knowledge-based expert systems, connectionist systems, genetic-algorithm-
based systems, and the like have shown in recent years that they can handle
problem solving. However, they often have limited capabilities and even
drawbacks (for instance, expert systems do not have any generalization
capability, and neural networks need extensive initial training). Some
authors have tried to build hybrid systems where those techniques are used
together to improve the capabilities of the system. This approach looks
promising but much research is still needed to improve the cooperation
among component problem solvers. Another area of research is the ex-
ploitation of the parallelism inherent in this sort of cooperative problem
solving, especially in view of the complex communication problems that
arise when problem solvers based on different paradigms are put in strong
interaction.
In this paper, we present a novel view of hybridization for problem solving
(section 2) and a multi-grain parallel architecture to support such systems
(section 3). Preliminary results supporting the claim that the architecture
is capable of doing so are given in section 4. Section 5 contains concluding remarks.
2. Hybrid systems
There are several computational approaches to problem solving: numerical
calculation (e.g. to solve equations), rule-based reasoning as in Expert
Systems (ES), and learning and generalization as in Artificial Neural
Networks (ANN) and Genetic Algorithms (GA). None of these approaches,
or paradigms, is a match for every problem. There is growing interest in
hybrid systems where these paradigms are jointly exploited (ANNs with ES
in [The90], ANNs with GAs in [Mar91], ANNs with conventional algorithms
in [LP91], ANNs producing rules in [BB90] and further work on perceptron
membranes [Def92], to mention a few). This is motivated by the hope that
hybrid systems can outperform single-paradigm systems by keeping only the
advantages of the components they are made of.
Three main questions remain to be solved for the successful application of
hybridization:
- To what degree should components cooperate?
- How does one organize communication among components based on
(widely) different paradigms?
- Which hardware architectures support hybrid systems most efficiently?
We are interested here in constructing hybrid systems in which each
problem-solving paradigm is put in the right place in the system and exploited
at the right time in the problem solving as a whole (this timing aspect might
well lead to dynamic system reconfiguration in some cases). This approach
clearly requires deep to very deep coupling in the sense of [Ker92], and this
level of coupling in turn increases the communication problems between the
different paradigms. These communication problems come from the fact that
the data handled by the various problem solvers are not of a similar nature
(symbolic versus sub-symbolic data). Some solutions are presented in [Ult92]
and [Sta92]. In some cases, coupling takes on another dimension when one
component effectively controls others.
Let us consider an example based on H. Dreyfus's know-how (skill) acquisition
theory [Dre91]. In this theory, there are five levels, or steps, in knowledge
acquisition:
- The beginner uses rules, possibly taught by someone else, somewhat
irrespective of context.
- The advanced beginner uses the same rules with more sensitivity to
context.
- The competent learner evaluates new rules and tests them. He uses
these new rules in accordance with the former rules and makes a choice
according to the result. Essential features become distinguishable from
subordinate features.
- The master (proficiency) can see his goal coming to light from the
subconscious, but he still has to poll for the rules he will use.
- The expert no longer consciously deliberates upon rules; indeed, he
has difficulty expressing the rules he uses.
We consider a system with expert system and artificial neural network
components that could mimic these steps in the following sense:
- The ES initially contains a number of rules.
- As new examples are presented, the ANNs enter a training phase.
- Eventually, the ANNs create and test new rules.
- The ES sees its rule set increase.
- After training, (other) ANNs act as the 'unconscious decision-making
process' of the expert to select the few rules critical to the solution of
the problem at hand.
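To make the intended control flow concrete, here is a minimal C sketch of the
sequence above. It is purely illustrative: the Rule structure, the stubbed ANN
routines, and all names are our own assumptions, not part of any existing
implementation.

    /* Illustrative sketch only: stubbed ES/ANN hybrid control flow. */
    #include <stdio.h>

    #define MAX_RULES 64

    typedef struct { int id; int useful; } Rule;   /* placeholder rule */

    static Rule rule_base[MAX_RULES] = { {1, 0}, {2, 0}, {3, 0} };
    static int  n_rules = 3;                       /* initial ES rules */

    /* Stub: ANNs trained on new examples propose candidate rules. */
    static int ann_extract_rules(Rule *out, int max) {
        int k = 0;
        for (int i = 0; i < 2 && k < max; i++, k++) {
            out[k].id = 100 + i;                   /* hypothetical rules */
            out[k].useful = 0;
        }
        return k;
    }

    /* Stub: a second ANN family tests candidate rules before adoption. */
    static int ann_validate_rule(const Rule *r) { return r->id % 2 == 0; }

    /* Stub: after training, a selector ANN marks the few critical rules. */
    static void ann_select_rules(Rule *rules, int n) {
        for (int i = 0; i < n; i++) rules[i].useful = (i < 2);
    }

    int main(void) {
        Rule candidates[MAX_RULES];
        int  n_cand = ann_extract_rules(candidates, MAX_RULES);

        for (int i = 0; i < n_cand; i++)           /* the ES rule set grows */
            if (ann_validate_rule(&candidates[i]) && n_rules < MAX_RULES)
                rule_base[n_rules++] = candidates[i];

        ann_select_rules(rule_base, n_rules);      /* 'unconscious' selection */

        for (int i = 0; i < n_rules; i++)
            if (rule_base[i].useful)
                printf("apply rule %d\n", rule_base[i].id);
        return 0;
    }

In a real hybrid system, each stub would be replaced by a full inference engine
or a trained network, and the different phases could run on different layers of
the target machine.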
An important point in this example is the sequence in which the ES and
ANN paradigms are applied. Another point we want to stress is that a
careful choice must be made within the family of ANN algorithms to best
match the different phases of the above process¹. It is our belief that such
a system could handle various classes of problems which are at present
solved by special-purpose systems.
However, there are some remarks to make before going further. The first
is that not all learning processes follow the Dreyfus know-how acquisition
theory: it works well for chess playing and driving, but probably not for
learning to walk. We point here to the problem of innate versus acquired
knowledge. There are also problems where learning may be thought of as
proceeding from sub-symbolic mastery to high-level symbolic processing
(some reading methods work this way). These two remarks lead us to
consider other learning processes where, for instance, ANNs begin and are
then relayed by ES (or other methods).
Another remark is that it should be possible to have various algorithms,
each specialized in one form or another of the learning process (various
ANNs, GAs, and so on). This need for great flexibility inside a general
system able to tackle various applications is satisfied by the flexible ArMenX
architecture we present in the next section.
¹ANNs which extract new rules from data, ANNs which test these new rules and
enhance the ES, and ANNs which select the few important rules from the expert
system's rule set might not be of a similar nature.
3. ArMenX
The ArMenX (pronounced 'armenex') architecture is organised in three
processing layers arranged in a replicated scheme, as depicted by the figures
below (figure 1: one ArMenX node; figure 2: the ArMenX machine on a
hybrid application). The communication scheme is purely distributed; thus,
there is no theoretical limit to the extension of an instantiated machine.
[Figure 1 (schematic): one ArMenX node, showing the host link into a T805
transputer (links 0-3) with 4 MB of RAM on a 32-bit bus, a Xilinx 4010 FPGA
with 24-bit North/East/South/West connections, a 128 KB SRAM, and a DSP 56002.]
Fig. 1. The ArMenX architecture: one node
The upper layer is a message-passing MIMD computer. The processor
used in the present version of our machine is the Inmos T805. Each T805
has access, within its own memory space, to 4 Megabytes of RAM and to
the middle layer of the machine, which is made of a network of FPGAs
(Xilinx 4010 in the present version).
This network is structured as a high-bandwidth communication-processing
ring, allowing, for instance, the implementation of a global pipelined process
distributed among the set of FPGAs. The X4010 contains 10,000 gates
associated with interconnect resources controlled by an SRAM memory
plane. This structure allows any type of combinatorial or sequential
operator to be implemented, running as fast as 50 MHz. This middle layer,
called the Reprogrammable Logical Layer (RLL), offers a very fine grain of
parallelism.
The bottom layer of ArMenX is made of DSP chips tightly coupled to the
FPGAs but not transversally interconnected as the upper layers are. Each
module contains a DSP processor (Motorola 56002 in the present version)
associated with a fast 128k-word SRAM. These modules are intended to
support vector-processing algorithms efficiently, typically low-level signal or
image processing.
At present the developer must use three tools in order to program the
different layers of the ArMenX machine:
- on the transputer layer, the supervision algorithms are encoded in the
parallel-C or OCCAM languages;
- the FPGA layer is programmed with classical CAD tools (ViewLogic,
Abel); typically, the hardware thus compiled handles the communication
between a transputer and its DSP as well as the fine-grain processing;
- the DSPs are programmed with the Motorola development chain (C and
assembly language).
Each of the three steps above produces a file which is downloaded to its
specific level. Each transputer loads the configuration file of its associated
FPGA in as little as 100 milliseconds and then programs its associated DSP.
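The per-node start-up described above can be summarized by the following C
sketch. It only illustrates the order of operations; load_fpga_bitstream() and
load_dsp_program() are hypothetical helper names, and the file names are
placeholders rather than actual tool-chain outputs.

    /* Sketch of the per-node bootstrap sequence (illustrative only). */
    #include <stdio.h>

    /* Hypothetical loaders: on the real machine the transputer writes the
     * Xilinx configuration stream and the DSP object code over its bus. */
    static int load_fpga_bitstream(const char *file) {
        printf("configuring FPGA from %s (~100 ms)\n", file);
        return 0;
    }
    static int load_dsp_program(const char *file) {
        printf("loading DSP 56002 program %s\n", file);
        return 0;
    }

    int main(void) {
        /* One file per layer, produced by the three tool chains. */
        const char *fpga_cfg = "node.bit";  /* CAD tools (ViewLogic/Abel) */
        const char *dsp_bin  = "node.cld";  /* Motorola tool chain        */

        if (load_fpga_bitstream(fpga_cfg) != 0) return 1;  /* step 1 */
        if (load_dsp_program(dsp_bin)    != 0) return 1;   /* step 2 */

        /* Transputer supervision code (parallel-C / OCCAM) starts here. */
        return 0;
    }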
4. Implementation of multiple-granularity algorithms on ArMenX
So far, algorithms have been tested separately on their specific layers. We
briefly present here the work done on each layer and then discuss the overall
integration.
We first investigated this architecture to implement ANNs. We worked on
handwritten digit recognition [AOT92]. In this application, each transputer
loads part of the synaptic coefficient matrix into the RAM of its DSP and
then presents series of input vectors (namely, the digits to be recognized).
The DSPs very efficiently compute the matrix operation
$$V_i = \sum_{j=1}^{n} W_{ij}\, e_j ,$$
thus allowing the neuron outputs to be calculated. One digit is recognized
in 200 microseconds on a 10-node machine.
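For illustration, the kernel executed on each DSP amounts to the following C
sketch of one slice of the matrix-vector product. The sizes, the array names,
and the assumption that each node holds a contiguous block of rows of W are
ours; they are not taken from [AOT92].

    /* Illustrative per-node kernel: V_i = sum_j W_ij * e_j for the
     * neurons whose weight rows are stored locally. */
    #include <stdio.h>

    #define LOCAL_NEURONS 16   /* rows of W held by this node (assumption) */
    #define N_INPUTS      256  /* input vector length (assumption)         */

    static float W[LOCAL_NEURONS][N_INPUTS];  /* local synaptic slice   */
    static float e[N_INPUTS];                 /* broadcast input vector */
    static float V[LOCAL_NEURONS];            /* local neuron outputs   */

    static void compute_outputs(void) {
        for (int i = 0; i < LOCAL_NEURONS; i++) {
            float acc = 0.0f;                 /* multiply-accumulate loop */
            for (int j = 0; j < N_INPUTS; j++)
                acc += W[i][j] * e[j];
            V[i] = acc;
        }
    }

    int main(void) {
        compute_outputs();
        printf("V[0] = %f\n", V[0]);
        return 0;
    }

The inner multiply-accumulate loop is the kind of operation the DSP layer is
meant to execute efficiently (in fixed-point arithmetic on the actual 56002,
whereas the sketch uses float for simplicity).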