Connectionist
Natural Language
Processing
Readings from Connection Science
Edited by
Noel Sharkey
University of Exeter
SPRINGER-SCIENCE+BUSINESS MEDIA, B.V.
This compilation's Copyright © 1992 Springer Science+Business Media Dordrecht
Originally published by Kluwer Academic Publishers in 1992
All rights reserved. No part of this publication may be reproduced, stored in a
retrieval system, or transmitted, in any form or by any means, electronic,
mechanical, photocopying, recording, or otherwise, without written permission.
Consulting editor: Masoud Yazdani
Cover design: Mark Lewis
British Library Cataloguing in Publication Data
Sharkey, N. E. (Noel E)
Connectionist natural language processing
I. Title
418
Library of Congress Cataloging-in-Publication Data
Connectionist natural language processing: readings from Connection
science / edited by Noel Sharkey
p. cm.
Includes index
ISBN 978-94-010-5160-6 ISBN 978-94-011-2624-3 (eBook)
DOI 10.1007/978-94-011-2624-3
1. Natural language processing (Computer science) 2. Connection
machines. I. Sharkey, N. E. (Noel E.) II. Connection science.
QA76.9.N38C66 1992
006.3'5--dc20 91-39077
Contents
Preface v
Dedication v
Introduction vi
1 Connectionism and Cognitive Linguistics 1
Catherine L Harris
2 A Connectionist Model of Motion and Government in 28
Chomsky's Government-binding Theory
John Rager & George Berg
3 Syntactic Transformations on Distributed 46
Representations
David J Chalmers
4 Syntactic Neural Networks 56
S M Lucas & R I Damper
5 Incremental Syntactic Tree Formation in Human 83
Sentence Processing: a Cognitive Architecture Based
on Activation Decay and Simulated Annealing
Gerard Kempen & Theo Vosse
6 A Hybrid Symbolic/Connectionist Model for Noun 101
Phrase Understanding
Stefan Wermter & Wendy G Lehnert
7 Connectionism and Determinism in a Syntactic 119
Parser
Stan C Kwasny & Kanaan A Faisal
8 A Single Layer Higher Order Neural Net and its 139
Application to Context Free Grammar Recognition
Peter J Wyard & Charles Nightingale
9 Connectionist Language Users 163
Robert B Allen
10 Script Recognition with Hierarchical Feature Maps 196
Risto Miikkulainen
11 Learning Distributed Representations of Conceptual 215
Knowledge and their Application to Script-based
Story Processing
Geunbae Lee, Margot Flowers & Michael Dyer
12 A Hybrid Model of Script Generation: or Getting the 248
Best from Both Worlds
Suzanne M Mannes & Stephanie M Doane
13 Identification of Topical Entities in Discourse: 275
a Connectionist Approach to Attentional Mechanisms
in Language
Lorraine F R Karen
14 The Role of Similarity in Hungarian Vowel Harmony: 295
a Connectionist Account
Mary Hare
15 Representation and Recognition of Temporal Patterns 323
Robert F Port
16 Networks that Learn about Phonological Feature 349
Persistence
Michael Gasser & Chan-Do Lee
17 Pronunciation of Digit Sequences in Text-to-Speech 363
Systems
W A Ainsworth & N P Warren
Index 372
Preface
This is a book of readings published in the journal Connection Science
between 1989 and 1991. In these first years in the life of the journal we
received a lot of papers on Natural Language Processing and Cognition.
All of the papers have gone through the normal rigorous journal reviewing
process and thus represent much of the state of the art. There are seventeen
papers in all. Eight of these are from a special issue on natural language
and I have included the editorial from that issue as a general introduction.
It describes the rapid rise of the subject and provides an historical
bibliography up to 1990. The book is laid out roughly in the traditional
categories of language research starting with syntax and moving through
question answering to knowledge application and speech processing.
I would like to thank the editorial board for the Natural Language special
issue for all of their efforts (listed on the page after the editorial). I would
also like to thank Jim Hendler who edited the special issue on Hybrid
Systems (where two of the papers appeared), Paul Day who helped with the
journal generally and has written the book index, and David Green at
Carfax who has been an inspiration and, of course, all of the anonymous
referees who worked hard for little reward. Finally, I must acknowledge
the industrious Lyn Shackleton (editorial assistant) without whom the
whole enterprise would have been a much more laborious task.
Noel Sharkey
Dedication
To my four wonderful aunts: Madge Lundy, Thelma Pringle, Eileen
Burns, and Alice Murray for being there.
Introduction
Connection Science and Natural Language:
an Emerging Discipline
NOEL E. SHARKEY
The journal Connection Science is pleased to present this special issue on
Connectionist Natural Language Processing (CNLP) to mark the coming of age of this new
approach to natural language. CNLP has really only taken off in the last five years.¹
Before that, very little CNLP research was actually published. Connectionist parsing
got under way with the localist work of Small et al. (1982), and work on distributed
propositional representations in semantic memory was started by Hinton (1981). The
Hinton paper was very influential in pointing to issues on representation that were to
be the motivation for later research (e.g. distributed v. localist representations,
classical v. uniquely connectionist representation, type/token v. part/whole
hierarchies). However, it was not until 1985 that CNLP began to emerge as a field of
enquiry in its own right. That year saw three papers on parsing using quite different
techniques: Fanty (1985) applied localist techniques to context-free parsing; Selman
(1985) utilized Boltzmann machine ideas for syntactic parsing; and Waltz & Pollack
(1985) presented the first hybrid system with a connectionist semantic net fronted by a
'symbolic' chart parser. Cottrell's (1985) thesis research, on word sense
disambiguation, also explored the use of connectionist syntactic constraints.
The following year began the line of research inspired by AI theories of Natural
Language Understanding (e.g. Golden, 1986; Lehnert, 1986; Sharkey et al., 1986).
This was followed closely by the publication of the highly influential two volume PDP
books edited by Rumelhart & McClelland (1986a). These volumes contained a number
of papers relating to aspects of natural language processing such as case assignment
(McClelland & Kawamoto, 1986); learning the past tense of verbs (Rumelhart &
McClelland, 1986b); and reading (McClelland, 1986). Moreover, the two volumes
expanded on some of the representational issues discussed earlier by Hinton (1981).
Since 1986, many more CNLP papers have appeared than it is possible to mention
here. Among these was further work on the application of world knowledge to
language understanding (e.g. Dolan & Dyer, 1987; Chun & Mimo, 1987; Sharkey,
1989a; Miikkulainen & Dyer, 1987); and further research on various aspects of syntax
and parsing (e.g. Hanson & Kegl, 1987; Howells, 1988; Benello et al., 1989). In
addition, we have begun to see a marked increase in the number of topics explored in
CNLP: phrase generation (e.g. Kukich, 1987; Gasser, 1988), question answering
(Allen, 1988), prepositional attachment (e.g. Cosic & Munro, 1988; Sharkey, 1990;
Wermter & Lehnert, 1990; St John & McClelland, 1990), anaphora (Allen & Riecken,
1988), goals and plans (Sharkey, 1988), inference (Lange & Dyer, 1989), variable
binding (Smolensky, 1987), and lexical processing (Kawamoto, 1989; Sharkey, 1989b).
Perhaps the biggest boost to CNLP research came unintentionally from a critique
of the field by Fodor & Pylyshyn (1988). Their aim was to do the same sort of 'hatchet
job' on connectionist language research as Chomsky had done on behaviourist language
research in the 1950s. However, this time the criticisms have prompted an industrious
research campaign to show that unique connectionist representations have the
properties necessary to represent natural language in terms of functional compositionality
(van Gelder, 1990); an ability to encode temporal structures (Elman, 1989) and an
ability to encode distributed recursive representations (Pollack, 1990; Smolensky,
1990).
It is clear to those who work in CNLP that the area is expanding rapidly, both in
terms of theory and applications. This is an exciting area, although it is difficult to
keep abreast of the most recent work because it is often published in obscure
conference proceedings. The number of submissions we received for this special issue
shows that the field is very healthy, and some of the best recent work is contained
herein. Nonetheless, we would like to see much more CNLP research published in
Connection Science. The competition is tough but we wholeheartedly welcome research
papers on any area of CNLP. We would particularly like to see more research and
discussion on some of the new representational issues on which the fate of CNLP may
rest in the nineties.
Note
1. This is not counting research on word recognition.
References
Allen, R.B. (1988) Sequential connectionist networks for answering simple questions about a microworld.
Proceedings of the 10th Annual Conference of the Cognitive Science Society, Montreal.
Allen, R.B. & Riecken, M.E. (1988) Anaphora and reference in connectionist language users. International
Computer Science Conference, Hong Kong.
Benello, J., Mackie, A.W. & Anderson, J.A. (1989) Syntactic category disambiguation with neural networks.
Computer Speech and Language, 3, 203-217.
Chun, H.W. & Mimo, A. (1987) A model of schema selection using marker parsing and connectionist
spreading activation. Proceedings of the 9th Annual Conference of the Cognitive Science Society, Seattle,
WA, pp. 887-896.
Cosic, C. & Munro, P. (1988) Learning to represent and understand locative prepositional phrases. TR
LIS002/IS88002, School of Library and Information Science, University of Pittsburgh, PA.
Cottrell, G.W. (1985) A connectionist approach to word sense disambiguation. PhD thesis, TR154,
Department of Computer Science, University of Rochester, NY.
Dolan, C.P. & Dyer, M.G. (1987) Symbolic schemata, role binding and the evolution of structure in
connectionist memories. IEEE First International Conference on Neural Networks, San Diego, 21-24 June,
II, pp. 287-298.
Elman, J.L. (1989) Representation and structure in connectionist models. TR 8903, CRL, University of
California, San Diego, CA.
Fanty, M. (1985) Context-free parsing in connectionist networks. TR-174, Department of Computer
Science, University of Rochester, NY.
Fodor, J.A. & Pylyshyn, Z.W. (1988) Connectionism and cognitive architecture: a critical analysis.
Cognition, 28, 3-71.
Gasser, M.E. (1988) A connectionist model of sequence generation in a first and second language. TR UCLA
AI-88-13, AI Lab, Computer Science Department, UCLA, July.
Gelder, T., van (1990) Compositionality: a connectionist variation on a classical theme. Cognitive Science, 14.
Golden, R.M. (1986) Representing causal schemata in connectionist systems. Proceedings of the 8th Annual
Conference of the Cognitive Science Society, pp. 13-21.
Hanson, S.J. & Kegl, J. (1987) PARSNIP: a connectionist network that learns natural language grammar from
exposure to natural language sentences. Proceedings of the 9th Annual Conference of the Cognitive Science
Society, Seattle, WA, pp. 106-119.
Hinton, G.E. (1981) Implementing semantic networks in parallel hardware. In G. E. Hinton &
J. A. Anderson (Eds) Parallel Models of Associative Memory. Hillsdale, NJ: Lawrence Erlbaum.
Howells, T. (1988) VITAL, a connectionist parser. Proceedings of the 10th Annual Conference of the Cognitive
Science Society, Montreal.
Kukich, K. (1987) Where do phrases come from: some preliminary experiments in connectionist phrase
generation. In G. Kempen (Ed.) Natural Language Generation: New Results from Artificial Intelligence,
Psychology and Linguistics. Dordrecht: Kluwer Academic, pp. 405-421.
Lange, T.E. & Dyer, M.G. (1989) High-level inferencing in a connectionist network. Connection Science, 1,
181-217.
Lehnert, W.G. (1986) Possible implications of connectionism. Theoretical Issues in Natural Language
Processing. University of New Mexico, pp. 78-83.
Kawamoto, A.H. (1989) Distributed representations of ambiguous words and their resolution in a
connectionist network. In S. L. Small, G. W. Cottrell & M. K. Tanenhaus (Eds) Lexical Ambiguity
Resolution. San Mateo, CA: Morgan Kaufmann.
McClelland, J.L. (1986) Parallel distributed processing and role assigning constraints. Theoretical Issues in
Natural Language Processing, University of New Mexico, pp. 72-77.
McClelland, J.L. & Kawamoto, A.H. (1986) Mechanisms of sentence processing: assigning roles to
constituents. In J. L. McClelland & D. E. Rumelhart (Eds) Parallel Distributed Processing, Vol. 2.
Cambridge, MA: MIT Press.
Miikkulainen, R. & Dyer, M.G. (1987) Building distributed representations without microfeatures. Technical
Report UCLA-AI-87-17, AI Laboratory, Computer Science Department, University of California at Los
Angeles, CA.
Pollack, J.B. (1990) Recursive distributed representations. Artificial Intelligence (in press).
Rumelhart, D.E. & McClelland, J.L. (Eds) (1986a) Parallel Distributed Processing, Vols. 1 & 2. Cambridge,
MA: MIT Press.
Rumelhart, D.E. & McClelland, J.L. (1986b) On learning the past tense of verbs. In D. E. Rumelhart &
J. L. McClelland (Eds) Parallel Distributed Processing, Vol. 2, Psychological and Biological Models.
Cambridge, MA: MIT Press, pp. 216-271.
St John, M.F. & McClelland, J.L. (1990) Learning and applying contextual constraints in sentence
comprehension. In R. Reilly & N. E. Sharkey (Eds) Connectionist Approaches to Natural Language
Processing. Hove: Lawrence Erlbaum (in press).
Selman, B. (1985) Rule-based processing in a connectionist system for natural language understanding. TR
CSRI-168, Computer Systems Research Institute, University of Toronto.
Sharkey, N.E. (1988) A PDP system for goal-plan decisions. In R. Trappl (Ed.) Cybernetics and Systems.
Dordrecht: Kluwer Academic, pp. 1031-1038.
Sharkey, N.E. (1989a) A PDP learning approach to natural language understanding. In I. Aleksander (Ed.)
Neural Computing Architectures. London: North Oxford Academic.
Sharkey, N.E. (1989b) The lexical distance model and word priming. Proceedings of the Eleventh Cognitive
Science Society Conference.
Sharkey, N.E. (1990) Implementing soft preferences for structural disambiguation. KONNAI (in press).
Sharkey, N.E., Sutcliffe, R.F.E. & Wobcke, W.R. (1986) Mixing binary and continuous connection schemes
for knowledge access. Proceedings of the American Association for Artificial Intelligence.
Small, S.L., Cottrell, G.W. & Shastri, L. (1982) Towards connectionist parsing. In Proceedings of the
National Conference on Artificial Intelligence, Pittsburgh, PA.
Smolensky, P. (1987) On variable binding and the representation of symbolic structures in connectionist
systems. TR CU-CS-355-87, Department of Computer Science, University of Colorado, Boulder, CO.
Smolensky, P. (1990) Tensor product variable binding and the representation of symbolic structures in
connectionist systems. Artificial Intelligence (in press).
Waltz, D.L. & Pollack, J.B. (1985) Massively parallel parsing: a strongly interactive model of natural
language interpretation. Cognitive Science, 9, 51-74.
Wermter, S. & Lehnert, W.G. (1990) Noun phrase analysis with connectionist networks. In R. Reilly &
N. E. Sharkey (Eds) Connectionist Approaches to Natural Language Processing. Hove: Lawrence Erlbaum
(in press).
Special Editorial Review Panel
Robert Allen, Bell Communication Research
Garrison W. Cottrell, University of California, San Diego
Michael G. Dyer, University of California, Los Angeles
Jeffrey L. Elman, University of California, San Diego
George Lakoff, University of California, Berkeley
Wendy G. Lehnert, University of Massachusetts, Amherst
Jordan Pollack, Ohio State University
Ronan Reilly, Beckman Institute, Illinois
Bart Selman, University of Toronto
Paul Smolensky, University of Colorado, Boulder
Chapter 1
Connectionism and Cognitive Linguistics
CATHERINE L. HARRIS
Cognitive linguists hypothesize that language is the product of general cognitive abilities.
Semantic and functional motivations are sought for grammatical patterns, sentence
meaning is viewed as the result of constraint satisfaction, and highly regular linguistic
patterns are thought to be mediated by the same processes as irregular patterns. In this
paper, recent cognitive linguistics arguments emphasizing the schematicity continuum,
the non-autonomy of syntax, and the non-compositionality of semantics are presented
and their amenability to connectionist modeling described. Some of the conceptual
matches between cognitive linguistics and connectionism are then illustrated by a
back-propagation model of the diverse meanings of the preposition over. The pattern set
consisted of a distribution of form-meaning pairs that was meant to be evocative of
English usage in that the regularities implicit in the distribution spanned the spectrum
from rules to partial regularities to exceptions. Under pressure to encode these regularities
with limited resources, the network used one hidden layer to recode the inputs into a set
of abstract properties. The properties discovered by the network correspond closely to
semantic features that linguists have proposed when giving an account of the meaning of
over.
KEYWORDS: Connectionism, semantics, syntax, polysemy, lexicon, schemas.
1. Introduction
Over the past decade a small but growing number of papers have argued that solutions
to enduring problems in semantics and grammar will require abandoning the
theoretical framework that has dominated linguistic research in the last 25-30 years (Lakoff,
1987a, 1987b; Langacker, 1982, 1986, 1987a, 1988; Bates & MacWhinney, 1982, 1987;
Fauconnier, 1985; Fillmore, 1988; Kuno, 1987; Talmy, 1975, 1983; Givon, 1979).
While the proponents of this refocusing have emphasized different linguistic problems,
they concur in rejecting the two major tenets of Chomskyan linguistics: the
separateness and specialness of language (Chomsky's hypothesized 'innate mental organ';
Chomsky, 1980) and the modularity of different types of linguistic information
(syntax, semantics, morphology, phonology).
In this new framework, language is viewed as a product of cognitive processes.
Researchers in cognitive linguistics have sought to show that neither the form nor the
Catherine L. Harris, Department of Cognitive Science, 0-015, University of California, La Jolla, CA 92093,
USA; Email: harris@cogsci.ucsd.edu; Tel: (619) 534-4348. This work was supported in part by an NSF
graduate fellowship to the author. The author thanks Farrel Ackerman, Ken Baldwin, Elizabeth Bates, George
Lakoff, David Touretzky and Cyma Van Petten for assistance on this project.