Springer Series in Information Sciences 1

Editors: Thomas S. Huang, Manfred R. Schroeder

Volume 1 Content-Addressable Memories. By T. Kohonen. 2nd Edition
Volume 2 Fast Fourier Transform and Convolution Algorithms. By H. J. Nussbaumer. 2nd Edition
Volume 3 Pitch Determination of Speech Signals: Algorithms and Devices. By W. Hess
Volume 4 Pattern Analysis. By H. Niemann
Volume 5 Image Sequence Analysis. Editor: T. S. Huang
Volume 6 Picture Engineering. Editors: King-sun Fu and T. L. Kunii
Volume 7 Number Theory in Science and Communication: With Applications in Cryptography, Physics, Digital Information, Computing, and Self-Similarity. By M. R. Schroeder. 2nd Edition
Volume 8 Self-Organization and Associative Memory. By T. Kohonen
Volume 9 Digital Picture Processing: An Introduction. By L. P. Yaroslavsky
Volume 10 Probability, Statistical Optics, and Data Testing: A Problem Solving Approach. By B. R. Frieden
Volume 11 Physical and Biological Processing of Images. Editors: O. J. Braddick and A. C. Sleigh
Volume 12 Multiresolution Image Processing and Analysis. Editor: A. Rosenfeld
Volume 13 VLSI for Pattern Recognition and Image Processing. Editor: King-sun Fu
Volume 14 Mathematics of Kalman-Bucy Filtering. By P. A. Ruymgaart and T. T. Soong
Volume 15 Fundamentals of Electronic Imaging Systems: Some Aspects of Image Processing. By W. F. Schreiber
Volume 16 Introduction to Statistical Radiophysics and Optics I: Random Oscillations and Waves. By S. A. Akhmanov, Y. Y. Dyakov, and A. S. Chirkin

Teuvo Kohonen
Content-Addressable Memories
Second Edition
With 123 Figures
Springer-Verlag Berlin Heidelberg New York London Paris Tokyo

Professor Teuvo Kohonen
Department of Technical Physics, Helsinki University of Technology, SF-02150 Espoo 15, Finland

Series Editors:
Professor Thomas S. Huang
School of Electrical Engineering, Purdue University, West Lafayette, IN 47907, USA
Professor Dr. Manfred R. Schroeder
Drittes Physikalisches Institut, Universität Göttingen, Bürgerstrasse 42-44, D-3400 Göttingen, Fed. Rep. of Germany

ISBN-13: 978-3-540-17625-1
e-ISBN-13: 978-3-642-83056-3
DOI: 10.1007/978-3-642-83056-3

Library of Congress Cataloging in Publication Data. Kohonen, Teuvo, Content-addressable memories. (Springer series in information sciences; 1) Bibliography: p. Includes index. 1. Associative storage. 2. Information storage and retrieval systems. I. Title. II. Series. TK7895.M4K63 1987 004.5 87-4765

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in other ways, and storage in data banks. Duplication of this publication or parts thereof is only permitted under the provisions of the German Copyright Law of September 9, 1965, in its version of June 24, 1985, and a copyright fee must always be paid. Violations fall under the prosecution act of the German Copyright Law.

© Springer-Verlag Berlin Heidelberg 1980 and 1987
Softcover reprint of the hardcover 2nd edition 1987

The use of registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

Offset printing and bookbinding: Brühlsche Universitätsdruckerei, Giessen
2153/3150-543210

Preface to the Second Edition

Due to continual progress in the large-scale integration of semiconductor circuits, parallel computing principles can already be met in low-cost systems: numerous examples exist in image processing, for which special hardware is implementable with quite modest resources even by nonprofessional designers. Principles of content addressing, if thoroughly understood, can thereby be applied effectively using standard components.
On the other hand, mass storage based on associative principles still exists only in the long-term plans of computer technologists. This situation is somewhat confused by the fact that certain expectations are held for the development of new storage media such as optical memories and "spin glasses" (metal alloys with low-density magnetic impurities). Their technologies, however, may not ripen until after "fifth generation" computers have been built. It seems that software methods for content addressing, especially those based on hash coding principles, are still holding their position firmly, and a few innovations have been developed recently. As they need no special hardware, one might expect that they will spread to a wide circle of users.

This monograph is based on an extensive literature survey, most of which was published in the First Edition. I have added Chap. 7, which contains a review of more recent work. This updated book now has references to over 1200 original publications.

In the editing of the new material, I received valuable help from Anneli Heimbürger, M.Sc., and Mrs. Leila Koivisto.

Otaniemi, Finland
February, 1987
Teuvo Kohonen

Preface to the First Edition

Designers and users of computer systems have long been aware of the fact that inclusion of some kind of content-addressable or "associative" functions in the storage and retrieval mechanisms would allow a more effective and straightforward organization of data than with the usual addressed memories, with the result that the computing power would be significantly increased. However, although the basic principles of content-addressing have been known for over twenty years, the hardware content-addressable memories (CAMs) have found their way only to special roles such as small buffer memories and control units.
This situation now seems to be changing: Because of the development of new technologies such as very-large-scale integration of semiconductor circuits, charge-coupled devices, magnetic-bubble memories, and certain devices based on quantum-mechanical effects, an increasing amount of active searching functions can be transferred to memory units. The prices of the more complex memory components, which earlier were too high to allow the application of these principles to mass memories, will be reduced to a fraction of the total system costs, and this will certainly have a significant impact on the new computer architectures.

In order to advance the new memory principles and technologies, more information ought to be made accessible to a common user. To date it has been extremely difficult to gain an overview of content-addressable memories; during the course of their development many different principles have been tried, and many electronic technologies on which these solutions have been based have become obsolete. More than one thousand papers have been published on content addressing, but this material has never been available in book form. Numerous difficulties have also been caused by the fact that many developments have been classified for long periods of time, and unfortunately there still exists material which is unreachable for a common user. The main purpose of this book has been to overcome these difficulties by presenting most of the relevant results in a systematic form, including comments concerning their practical applicability and future development.

The organization of material in this book has been based on a particular logic: the text commences with basic concepts, and the software methods are presented first because they are expected to be of interest to all who have access to general-purpose computers.
Computer designers who are concerned with the intimate hardware, too, will be interested in the logic principles and electronic implementations of CAM circuits given in the next chapters; the highly parallel content-addressable computers which may concern only a few specialists are reviewed last. Nonetheless it may be envisioned that future computer systems will increasingly acquire these principles so that the circle of users becoming acquainted with these devices will widen.

One may also notice that content addressing has been understood in this book in a more general way than usual. Although the term "content-addressable memory" in computer engineering has usually been restricted to referring to certain hardware constructs only, nonetheless there have existed pure software solutions, e.g., the extremely fast searching method named hash coding which for equally good reasons ought also to be called content-addressing. On the other hand, while the implementation of semantic and relational data structures has usually been discussed in the context of software searching methods, essentially the same principles are directly amenable to implementation by special hardware. Such hardware is already being developed for hash coding, and thus the boundary between the hardware and software approaches is disappearing. It was therefore felt that both of these approaches should be discussed in the same volume, in order to facilitate their comparison and to make the issues in favor of one or the other more clear. Chapter 2 which deals with hash coding has been made as complete as possible.

This book is an outgrowth of courses given at the Helsinki University of Technology, as well as research work pursued on associative memory under the auspices of the Academy of Finland. I am grateful for all material help obtained from the above institutions. The assistance of the following persons should be appreciated: Mrs.
Pirjo Teittinen helped in the collection of literature and typed out my manuscripts. Mr. Erkki Reuhkala contributed to the compilation of material on hash coding. Mrs. Rauha Tapanainen, as well as Mr. Heikki Riittinen, prepared the drawings in their final form. Mrs. Maija-Liisa Hylkila typed out the Subject Index. Proofreading of the complete book was done by Mrs. Hylkila as well as Dr. Erkki Oja; parts of it were checked by other members of our Laboratory, too.

Otaniemi, Finland
November, 1979
Teuvo Kohonen

Contents

Chapter 1 Associative Memory, Content Addressing, and Associative Recall
1.1 Introduction
    1.1.1 Various Motives for the Development of Content-Addressable Memories
    1.1.2 Definitions and Explanations of Some Basic Concepts
1.2 The Two Basic Implementations of Content Addressing
    1.2.1 Software Implementation: Hash Coding
    1.2.2 Hardware Implementation: The CAM
1.3 Associations
    1.3.1 Representation and Retrieval of Associated Items
    1.3.2 Structures of Associations
1.4 Associative Recall: Extensions of Concepts
    1.4.1 The Classical Laws of Association
    1.4.2 Similarity Measures
    1.4.3 The Problem of Infinite Memory
    1.4.4 Distributed Memory and Optimal Associative Mappings
    1.4.5 Sequential Recollections

Chapter 2 Content Addressing by Software
2.1 Hash Coding and Formatted Data Structures
2.2 Hashing Functions
2.3 Handling of Collisions
    2.3.1 Some Basic Concepts
    2.3.2 Open Addressing
    2.3.3 Chaining (Coalesced)
    2.3.4 Chaining Through a Separate Overflow Area
    2.3.5 Rehashing
    2.3.6 Shortcut Methods for Speedup of Searching
2.4 Organizational Features and Formats of Hash Tables
    2.4.1 Direct and Indirect Addressing
    2.4.2 Basic Formats of Hash Tables
    2.4.3 An Example of Special Hash Table Organization
2.5 Evaluation of Different Schemes in Hash Coding
    2.5.1 Average Length of Search with Different Collision Handling Methods
    2.5.2 Effect of Hashing Function on the Length of Search
    2.5.3 Special Considerations for the Case in Which the Search Is Unsuccessful
2.6 Multi-Key Search
    2.6.1 Lists and List Structures
    2.6.2 An Example of Implementation of Multi-Key Search by Hash Index Tables
    2.6.3 The Use of Compound Keywords in Hashing
2.7 Implementation of Proximity Search by Hash Coding
2.8 The TRIE Memory
2.9 Survey of Literature on Hash Coding and Related Topics

Chapter 3 Logic Principles of Content-Addressable Memories
3.1 Present-Day Needs for Hardware CAMs
3.2 The Logic of Comparison Operations
3.3 The All-Parallel CAM
    3.3.1 Circuit Logic of a CAM Bit Cell
    3.3.2 Handling of Responses from the CAM Array
    3.3.3 The Complete CAM Organization
    3.3.4 Magnitude Search with the All-Parallel CAM
3.4 The Word-Parallel, Bit-Serial CAM
    3.4.1 Implementation of the CAM by the Linear-Select Memory Principle
    3.4.2 Skew Addressing
    3.4.3 Shift Register Implementation
    3.4.4 The Results Storage
    3.4.5 Searching on More Complex Specifications
3.5 The Word-Serial, Bit-Parallel CAM
3.6 Byte-Serial Content-Addressable Search
    3.6.1 Coding by the Characters
    3.6.2 Specifications Used in Document Retrieval
    3.6.3 A Record-Parallel, Byte-Serial CAM for Document Retrieval
3.7 Functional Memories
    3.7.1 The Logic of the Bit Cell in the FM
    3.7.2 Functional Memory 1
    3.7.3 Functional Memory 2
    3.7.4 Read-Only Functional Memory
3.8 A Formalism for the Description of Micro-Operations in the CAM
3.9 Survey of Literature on CAMs

Chapter 4 CAM Hardware
4.1 The State-of-the-Art of the Electronic CAM Devices
4.2 Circuits for All-Parallel CAMs
    4.2.1 Active Electronic Circuits for CAM Bit Cells
    4.2.2 Cryotron-Element CAMs
    4.2.3 Josephson Junctions and SQUIDs for Memories
4.3 Circuits for Bit-Serial and Word-Serial CAMs
    4.3.1 Semiconductor RAM Modules for the CAM
    4.3.2 Magnetic Memory Implementations of the CAM
    4.3.3 Shift Registers for Content-Addressable Memory
    4.3.4 The Charge-Coupled Device (CCD)
    4.3.5 The Magnetic-Bubble Memory (MBM)
4.4 Optical Content-Addressable Memories
    4.4.1 Magneto-Optical Memories
    4.4.2 Holographic Content-Addressable Memories