Table Of ContentEntity-Relationship Modeling
Springer-Verlag Berlin Heidelberg GmbH
Bernhard Thalheim
Entity-Relationship
Modeling
Foundations of Database Technology
With 160 Figures
Springer
Professor Dr. Bernhard Thalheim
Department of Computer Science
Brandenburg University of Technology at Cottbus
Universitatsplatz 2-4
03044 Cottbus, Germany
thalheim@informatik.tu-cottbus.de
Library of Congress Cataloging-in-Publication Data
Thalheim, B. (Bernhard), 1952-
Entity-relationship modeling: foundations of database technology/
Bernhard Thalheim.
p. em.
Includes bibliographical references and index.
ISBN 978-3-642-08480-5 ISBN 978-3-662-04058-4 (eBook)
DOI 10.1007/978-3-662-04058-4
1. Database management. 2. Database design. I. Title
QA76.9.D3 T4443 2000
005.74-dc21 99-051358
ACM Computing Classification (1998):
H.l-2, D.2-3, G.2.0, F.4.1, F.l.l, 1.2.4, H.4.0, K.6.3-4
ISBN 978-3-642-08480-5
This work is subject to copyright. All rights are reserved, whether the whole or part of the
material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data
banks. Duplication of this publication or parts thereof is permitted only under the pro
visions of the German Copyright Law of September 9, 1965, in its current version, and
permission for use must always be obtained from Springer-Verlag Berlin Heidelberg GmbH.
Violations are liable for prosecution under the German Copyright Law.
© Springer-Verlag Berlin Heidelberg 2000
Originally published by Springer-Verlag Berlin Heidelberg New York in 2000
Softcover reprint of the hardcover 1st edition 2000
The use of general descriptive names, trademarks, etc. in this publication does not imply,
even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
Cover Design: Kiinkel + Lopka, Werbeagentur, Heidelberg
Typesetting: Camera ready by the author
SPIN 10552740-Printed on acid-free paper-45/3142SR-54 3 2 1 0
Preface
What you inherit from your father,
Ean1 it anew before you call it yours.
What does not serve you is a heavy burden,
What issues from the moment is alone of use.
Goethe, Faust I, Night 1
Before a database system can be used the system needs to be designed.
Database design is the process of determining the organization of a database
application. For database design the structural, semantic and operational
information of an application is acquired and used in order to derive effi
cient management of the database. In addition, the organizational informa
tion needs to be considered. This design task is similar to task solving. Given
a task, we develop a language for representation of the task and for describ
ing the derived solution. We also know the complexity of the language and
of the solution space. This knowledge of the complexity of the language can
be used to search for simpler solutions. In this case, optimization of both
the task and the solution derivation is necessary. This optimization can be
based on structures which are easy to operate. For a long time, database
theory dealt with such structural prerequisites for optimization. One vehicle
for deriving simple structures is normalization. However, this is not sufficient.
Since a database system is used by a variety of users who have a diverging
set of operational requirements, the design of efficient database systems be
comes very difficult. The variety of users implies that the database structure,
operations and maintenance should be simple and feasible for 'normal' users.
Especially for large and complex databases, the development of understand
able structures becomes the crucial factor with regard to later use of the
database. Therefore, a database model which is understandable and compre
hensible has to be developed. The relational database model was the first
powerful and theoretically based model. However, in the relational model,
complex database schemes were not feasible and were seldom understand
able. In addition, the representation of semantic information required high
abstraction abilities. Understanding this situation, Peter P. S. Chen devel
oped the entity-relationship (ER) model. The latter is based on a simple
graphical representation. Using this diagrammatic technique, even complex
1 Translations by Peter Salm (Faust I, Bantam Books, 1962) and Philip Wayne
(Faust II, Penguin Books Ltd, 1959).
VI Preface
schemes can be understood. This model has been so successful that it is used
at present in many branches of computer science, even in software engineer
ing. Many different extensions and dialects have been developed although the
theoretical basis of the ER model was not well developed. Different semantics
have been used in different books, articles, and systems and there are even
books with changing semantics. Therefore, there is an urgent need for an
overview of the theoretical background of the ER model.
This book makes an attempt to bridge the gap between the ER the
ory and the ER application. The main aim is to provide a theoretical ba
sis for database design. This basis is used for the development of an effec
tive methodology. Database design methodologies should facilitate database
modeling, effectively support database processing and transform the con
ceptual schema of the database to a high-performance database schema in
the database management system (DBMS) currently in use. Since the late
1970's, various methodologies for database design have been introduced. Most
of them, however, are dependent on the knowledge, comprehension and ex
perience of the database analyst and his or her knowledge of normalization
theory. The proposed methodology does not require the user to understand
the theory, the implementational restrictions or the programming problems
in order to design a database scheme. Thus even an inexperienced designer
could create a database design successfully using this method. The ER model
is extended to the Higher-order Entity-Relationship Model (HERM) by re
lationships of higher degrees and relationships of relationships. The latter
model is used for high-level database design systems. It supports an efficient
translation to nested relational, relational, network and hierarchical schemes.
The model has the expressive power of semantic models and possesses the
simplicity of the ER model.
Cottbus, January 2000 Bernhard Thalheim
Acknowledgements
Dedicated to Klaus f1 Konrad f1 Valeria.
f3 (B.T.)
This book is a record of research and discussions over the last ten years. It
is distilled from courses taught at universities and during conferences and
summer schools. It was enjoyable to write, particularly because it was a joint
effort with many interesting people, who I would now like to thank.
Firstly, I thank my family; without their uncritical support nothing would
have been possible.
I have benefited enormously from the comments made by early readers at
the universities of Budapest, Cottbus, Dresden, Hamburg, Kuwait, Moscow
and Rostock. I distinctly thank:
Sabah Al-Fedaghi, Peter Bachmann, Catriel Beeri, Peter P. Chen, Janos
Demetrovics, Klaus-Dieter Schewe, Joachim W. Schmidt.
I would also like to thank my assistants and students in Cottbus, Dresden,
Kuwait and Rostock who read various portions of the manuscript at various
stages and questioned the proposed solutions.
There is not enough space to thank all the colleagues who supported me
by providing additional information, sending (unpublished) manuscripts and
guiding me through the mass of database literature. Most of them are 'only'
cited. I am very thankful to the ER, MFDBS and ICDT communities. Many
insights discussed here can be traced back to discussions with colleagues from
these communities.
I am thankful to Edith Buchholz, Karla Kersten, Samantha Lane, Wendy
Ran (Springer) and David Reinhardt for proofreading.
Lastly, I thank my editors Hans Wossner and Ingeborg Mayer for putting
up with me. Somehow they endured with patience, diligence, and the hun
dreds and hundreds of days this book to get from prospectus to publication.
Human beings do their best work by iteration, and this book was no ex
ception. The first version of it was ready in 1992. This was the time the
extensive Springer support began. I am very grateful for this support, many
helpful suggestions, comments, corrections, criticism and assistance that sig
nificantly helped shape the contents of this work.
The book will never be finished, just published. With regard to success I
would like to thank to readers for acquiring this compendium. I hope that it
is useful to you and that you enjoy it as much as I enjoyed writing it. It can
VIII Acknowledgements
be used in undergraduate, graduate and research courses. In order to support
teaching additional material including slides can be found at:
http:/ jwww .informatik.tu-cottbus.de/" 'thalheim/HERM.htm
The compendium includes research on the ER model, on extensions of the
ER model, on syntax, semantics and pragmatics of database modeling. The
extensions allow an integrated development and modeling of database ap
plications. Thus, this model allows the specification of structures, behavior
and interaction. The achievements of research on ER modeling are given in a
survey aiming in completeness and correctness. Please address any comments
(and, therefore, force me to write another more correct and complete version)
to:
thalheim @ informatik.tu-cottbus.de
Contents
Notation Conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . XI
1. Introduction . . . . . . ... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Database Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Overview of the Book . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.3 Advantages of the Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2. The Database Design Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.1 Databases and Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.2 Design Quality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.3 Objects To Be Modeled. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3. The Entity-Relationship Model.. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.1 The Structural Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.2 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
4. Extending the Entity-Relationship Model . . . . . . . . . . . . . . . . . . . . . 55
4.1 Overview of Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
4.2 Entity and Relationship Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
4.3 Representing Structures of Applications Through HERM Diagrams 73
4.4 Benefits of the HERM Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
4.5 Extending the Model To Make It More Meaningful. . . . . . . . . . . . . . 93
5. Integrity Constraints .......................................... 105
5.1 Logical Foundations of Semantics ............................. 111
5.2 Generalizing Relational Constraints ........................... 122
5.3 Entity-Relationship Constraints .............................. 163
5.4 Graphical Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
5.5 Incompleteness During Semantics Acquisition . . . . . . . . . . . . . . . . . . 208
5.6 Constraints in Models with Richer Type Systems ............... 214
6. Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
6.1 ER Languages ............................................... 219
6.2 The HERM Algebra ......................................... 223
6.3 Query-By-Example ......................................... 245
6.4 The Logical Calculus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251
6.5 Query Forms ............................................... 253
6.6 Behavior Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 256
X Contents
7. Behavior Modeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263
7.1 Dynamic and Temporal Integrity Constraints. . . . . . . . . . . . . . . . . . . 263
7.2 Dynamic Semantics ......................................... 271
7.3 Axiomatic Approach to Behavior Modeling. . . . . . . . . . . . . . . . . . . . . 289
7.4 Enforcing Integrity Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 293
7.5 Soft Constraints and Deontic Logics ........................... 308
8. Applying the Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313
8.1 Identification in ER Schemata . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313
8.2 Classifying Binary, Is-A and Subobject Relationship Types ....... 323
8.3 Views . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 333
8.4 Equivalence in HERM Schemes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 368
8.5 Normalization .............................................. 392
9. The Database Design Process in HERM. . . . . . . . . . . . . . . . . . . . . . . 425
9.1 Database Design Methodologies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 427
9.2 Classical Database Design Methodologies ...................... 434
9.3 The HERM Methodology Based on Modular Design. . . . . . . . . . . . . 445
9.4 Codesign of Structures, Functions and Interaction. . . . . . . . . . . . . . . 462
9.5 Natural Language-Based Database Design ..................... 478
9.6 Computer-Aided Database Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . 488
10. The Translation of HERM Schemes . . . . . . . . . . . . . . . . . . . . . . . . . . . 503
10.1 The Relational Representation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 503
10.2 The Network and Hierarchical Representations ................. 533
10.3 Reverse Engineering of Translated Schemes .................... 545
11. Open Problems ................................................ 551
References ................................................................. 559
List of Figures ............................................................ 615
Index ....................................................................... 619