
INDEX DATA STRUCTURES IN OBJECT-ORIENTED DATABASES

The Kluwer International Series on ADVANCES IN DATABASE SYSTEMS

Series Editor
Ahmed K. Elmagarmid
Purdue University
West Lafayette, IN 47907

Other books in the Series:

DATABASE CONCURRENCY CONTROL: Methods, Performance, and Analysis, by Alexander Thomasian. ISBN: 0-7923-9741-X
TIME-CONSTRAINED TRANSACTION MANAGEMENT: Real-Time Constraints in Database Transaction Systems, by Nandit R. Soparkar, Henry F. Korth, Abraham Silberschatz. ISBN: 0-7923-9752-5
SEARCHING MULTIMEDIA DATABASES BY CONTENT, by Christos Faloutsos. ISBN: 0-7923-9777-0
REPLICATION TECHNIQUES IN DISTRIBUTED SYSTEMS, by Abdelsalam A. Helal, Abdelsalam A. Heddaya, Bharat B. Bhargava. ISBN: 0-7923-9800-9
VIDEO DATABASE SYSTEMS: Issues, Products, and Applications, by Ahmed K. Elmagarmid, Haitao Jiang, Abdelsalam A. Helal, Anupam Joshi, Magdy Ahmed. ISBN: 0-7923-9872-6
DATABASE ISSUES IN GEOGRAPHIC INFORMATION SYSTEMS, by Nabil R. Adam and Aryya Gangopadhyay. ISBN: 0-7923-9924-2

The Kluwer International Series on Advances in Database Systems addresses the following goals:
• To publish thorough and cohesive overviews of advanced topics in database systems.
• To publish works which are larger in scope than survey articles, and which will contain more detailed background information.
• To provide a single point coverage of advanced and timely topics.
• To provide a forum for a topic of study by many researchers that may not yet have reached a stage of maturity to warrant a comprehensive textbook.

INDEX DATA STRUCTURES IN OBJECT-ORIENTED DATABASES

by
Thomas A. MUECK
Martin L. POLASCHEK
Universität Wien
Vienna, Austria

SPRINGER SCIENCE+BUSINESS MEDIA, LLC

ISBN 978-1-4613-7849-5
ISBN 978-1-4615-6213-9 (eBook)
DOI 10.1007/978-1-4615-6213-9

Library of Congress Cataloging-in-Publication Data
A C.I.P. Catalogue record for this book is available from the Library of Congress.
Copyright © 1997 by Springer Science+Business Media New York
Originally published by Kluwer Academic Publishers, New York in 1997
Softcover reprint of the hardcover 1st edition 1997

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher, Springer Science+Business Media, LLC.

Printed on acid-free paper.

CONTENTS

Preface
1 INTRODUCTION
  1.1 Object-oriented databases and indexing
  1.2 Application aspects
2 DATABASE MODEL
  2.1 Object Model
  2.2 Query language issues
  2.3 Bibliography
3 DATA STRUCTURES AND INDEXING
  3.1 Basics
  3.2 A systematic approach
  3.3 One-dimensional search data structures
  3.4 Multi-dimensional Search Data Structures
  3.5 Bibliography
4 TYPE HIERARCHY INDEXING
  4.1 Problem description
  4.2 Type grouping
  4.3 Key grouping
  4.4 Multikey type index
  4.5 Bibliography
5 AGGREGATION PATH INDEXING
  5.1 Problem description
  5.2 Path decomposition schemes
  5.3 Bibliography
6 COLLECTION OPERATIONS
  6.1 Problem description
  6.2 Signature files for indexing multi-valued properties
  6.3 Bibliography
7 PERFORMANCE ANALYSIS - AN EXAMPLE
  7.1 Storage space requirements
  7.2 Query performance
REFERENCES
INDEX

PREFACE

Object-oriented database management systems (OODBMS) are used to implement and maintain large object databases on persistent storage. Regardless of whether the underlying database model follows the object-oriented, the relational or the object-relational paradigm, a key feature of any DBMS product is content-based access to data sets. On the one hand, this feature provides user-friendly query interfaces based on predicates to describe the desired data.
On the other hand, it poses challenging questions regarding DBMS design and implementation as well as the application development process on top of the DBMS. The reason for the latter is that the actual query performance depends on a technically meaningful use of access support mechanisms. In particular, if chosen and applied properly, such a mechanism speeds up the execution of predicate-based queries. In the object-oriented world, such queries may involve arbitrarily complex terms referring to inheritance hierarchies and aggregation paths. These features are attractive at the application level; however, they increase the complexity of appropriate access support mechanisms, which are known to be technically non-trivial even in the relational world.

In the field of databases and database management systems, such an access support mechanism for improved query performance relies on one or more underlying search data structures and is usually called an index. Informally, the central idea behind this kind of data structure application is to find the identifiers of all objects fulfilling a given query predicate without reading the objects from disk. The practical benefit of indexing large persistent object sets is therefore a significant reduction in the number of disk I/O operations, which yields a performance gain.

The purpose of this book is to provide technical information about current and future issues of search data structures used to index large object-oriented databases. The intended audience of this book includes all kinds of practitioners involved in OODBMS product selection, application-dependent database performance tuning and application development on top of object databases, as well as researchers and students interested in the technical issues of object-oriented databases.
The only prerequisite for understanding the material presented in this book is a working knowledge of object-oriented modeling and programming concepts and a minimum knowledge of algebraic concepts such as sets.

After the introduction, two preparatory chapters follow: one presents the underlying database model as outlined in the ODMG-93 proposal [Cat96], the other elaborates on the technical issues of search data structures and their use for indexing large data sets. The three subsequent chapters deal with major indexing topics in object-oriented databases, in particular type hierarchy indexing, aggregation path indexing, and speedup of collection operations. The presentation is concluded with a performance analysis example in the field of type hierarchy indexing.

A related issue not covered in this book is physical object clustering or, in other words, the mapping of object identifiers to physical storage addresses. Decoupling support for content-based access from physical object management and, therefore, the indexing component from an OODBMS's persistent object store provides a high degree of flexibility for both application programmers and system developers. Therefore, the issues in the context of object clustering form a separate research domain beyond the scope of this book.

Details about the indexing components of particular OODBMS products have been omitted from this book for two reasons. First, it is hardly possible to get detailed technical information on the indexing components from vendors, and secondly, even if this kind of information could be obtained, it quickly becomes dated. So it seems more appropriate to describe the technical issues and solutions in this field and in this way help the reader to decide about products in the presence of timely and hopefully detailed information.

Acknowledgments

We thank our colleagues at the Abteilung für Data Engineering, Universität Wien, for hours of fruitful discussions and in particular our former roommate Erich Schikuta for introducing us to the versatile field of search data structures in the early days. Also, we would like to thank Professor Ahmed K. Elmagarmid for supporting this book project.

This book would not exist without the continuing encouragement by the people at Kluwer Academic Publishers, especially by Scott Delman and his staff. Special thanks for being patient.

1 INTRODUCTION

One of the most important features of database management systems is associative or, in other words, content-based access to large data sets held on persistent storage media like magnetic disks. In terms of the object-oriented model, an OODBMS provides access to stored objects not only by using object references¹ but also by using query predicates for particular object properties. The basic requirements to be met by any content-based retrieval mechanism are well known, for example from database management systems based on the relational model. However, due to the semantic richness and the expressive power of the object-oriented model, there is an even larger variety of content-based access patterns to be used in object-oriented query specifications.
Important examples (see also Figure 1.1 for a simple database schema) of object-oriented modeling features yielding characteristic access patterns in the context of OODBMS are:
• properties of a particular object type which are implicitly also part of all subtypes by means of type inheritance, for example, the property income belongs not only to object type Person but also to types Student and Staff,
• links between different object types (also known as relationships) like worksIn relating type Staff to type Department by means of a so-called aggregation path, and
• instance variables corresponding to multi-valued properties, for example, the property hobbies of type Person used to store a set of hobbies per object of type Person.

¹ Technically, by data items referring to stored objects using an address or a unique object identifier used to abstract from addresses.

T. A. Mueck et al., Index Data Structures in Object-Oriented Databases © Kluwer Academic Publishers 1997

[Figure 1.1: Simple object-oriented database schema]

The implications of these features for the processing of the resulting query predicates are briefly discussed below.

1.1 OBJECT-ORIENTED DATABASES AND INDEXING

In each of the above-mentioned cases, the performance characteristics of content-based queries can be enhanced by an access support mechanism or, in other words, by an index which uses search data structures to locate all the qualified data.

What is the reason for this enhancement and the central idea behind access support mechanisms in object-oriented databases? In a nutshell: when processing content-based queries without any indexing support, an OODBMS has to fetch all objects belonging to the referenced types from mass storage in order to determine the objects fulfilling the predicates.
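
The contrast between an exhaustive scan and an index lookup can be sketched in a few lines of Python. This is a minimal illustration only, not a structure taken from the book: ObjectStore, IncomeIndex, scan and build_example are hypothetical names, the "disk" is an in-memory dictionary with a fetch counter, and a sorted list stands in for a real search data structure. Only the types Person, Student and Staff and the property income follow Figure 1.1.

```python
import bisect
from dataclasses import dataclass


@dataclass
class Person:
    oid: int        # object identifier
    income: float   # property inherited by all subtypes


@dataclass
class Student(Person):
    pass


@dataclass
class Staff(Person):
    pass


class ObjectStore:
    """Stand-in for persistent storage; counts simulated disk fetches."""

    def __init__(self):
        self._objects = {}
        self.fetches = 0

    def put(self, obj):
        self._objects[obj.oid] = obj

    def fetch(self, oid):
        self.fetches += 1
        return self._objects[oid]

    def oids(self):
        return list(self._objects)


def scan(store, low, high):
    """Without an index: fetch every object and test the predicate."""
    return [oid for oid in store.oids()
            if low <= store.fetch(oid).income <= high]


class IncomeIndex:
    """Sorted (income, oid) pairs; answers range predicates without fetches."""

    def __init__(self):
        self._entries = []

    def insert(self, obj):
        # Each stored object is also entered into the index.
        bisect.insort(self._entries, (obj.income, obj.oid))

    def range(self, low, high):
        lo = bisect.bisect_left(self._entries, (low, float("-inf")))
        hi = bisect.bisect_right(self._entries, (high, float("inf")))
        return [oid for _, oid in self._entries[lo:hi]]


def build_example():
    """Create a tiny store and index covering Person and its subtypes."""
    store, index = ObjectStore(), IncomeIndex()
    for obj in (Person(1, 30000.0), Student(2, 8000.0),
                Staff(3, 45000.0), Student(4, 12000.0)):
        store.put(obj)
        index.insert(obj)
    return store, index
```

With the four sample objects, scan(store, 10000.0, 40000.0) calls fetch once per stored object, whereas index.range(10000.0, 40000.0) returns the same set of identifiers without touching the store at all.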
Informally, the advantage of an index is that the OODBMS is able to locate qualifying objects without fetching all objects of the appropriate type from disk, thereby providing associative or content-based access to object sets. However, the disadvantage is a certain amount of maintenance overhead incurred by each index: basically, one or more search data structures have to be updated whenever the corresponding object set is updated.

So far, there is not much difference from indexing in relational database systems. Now, what is so special about indexing in object database systems? Generally

