
The Concurrency Control Problem for Database Systems

183 Pages·1981·2.155 MB·English


Lecture Notes in Computer Science
Edited by G. Goos and J. Hartmanis

116

Marco Antonio Casanova

The Concurrency Control Problem for Database Systems

Springer-Verlag
Berlin Heidelberg New York 1981

Editorial Board
W. Brauer, P. Brinch Hansen, D. Gries, C. Moler, G. Seegmüller, J. Stoer, N. Wirth

Author
Marco Antonio Casanova
Departamento de Informática, Pontifícia Universidade Católica do R.J.
Rua Marques de S. Vicente, 209
22.453, Rio de Janeiro, RJ, Brasil

CR Subject Classifications (1981): 3.7, 4.3, 4.9

ISBN 3-540-10845-9 Springer-Verlag Berlin Heidelberg New York
ISBN 0-387-10845-9 Springer-Verlag New York Heidelberg Berlin

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically those of translation, reprinting, re-use of illustrations, broadcasting, reproduction by photocopying machine or similar means, and storage in data banks. Under § 54 of the German Copyright Law where copies are made for other than private use, a fee is payable to "Verwertungsgesellschaft Wort", Munich.
© by Springer-Verlag Berlin Heidelberg 1981
Printed in Germany
Printing and binding: Beltz Offsetdruck, Hemsbach/Bergstr.
2145/3140-543210

PREFACE

This monograph investigates the problem of avoiding synchronization anomalies in database systems within two broad scenarios.

Part I considers general purpose database systems supporting any transaction mix accessing any database. Solutions to this problem are described by schedulers arbitrating access to the database. A model of schedulers is defined capturing the dynamic acquisition of information about transactions that characterize general purpose database systems. The absence of synchronization anomalies is then linked to restrictions on the sequences of accesses to the database (logs) by necessary and sufficient conditions.
The construction of schedulers based on log restrictions is discussed, with emphasis on the tradeoffs between the amount of information at hand, the level of concurrency and the ability to restart operations. Finally, the prototype of a scheduler is presented that uses a log restriction weaker than any previously known practical scheduler.

Part II concentrates on update-intensive transaction systems characterized by a known set of transactions accessing a fixed database. Tools and techniques are introduced to describe special synchronization strategies, state their correctness criteria and verify their adequacy. Namely, a data description language and a data manipulation language (DML) for relational databases are defined and equipped with appropriate logics. Both languages allow for the full use of aggregation operators and the DML includes simple and elegant constructs to describe concurrent computations and synchronization. A consistent and arithmetically complete axiom system for the DML is presented, which extends that of First-Order Dynamic Logic. Finally, the problem of proving serializability of transaction systems is discussed. Two heuristics facilitating this task are introduced. The first one handles any transaction system and, moreover, induces a method of synthesizing synchronization code. The second one was designed to take advantage of the structure of a special type of transaction systems.

TABLE OF CONTENTS

PREFACE
TABLE OF CONTENTS
1. INTRODUCTION
1.1 Statement of the Problem
1.2 The Concurrency Control Problem for Database Management Systems
1.3 The Concurrency Control Problem for Transaction Systems
1.4 Related Work
PART I: THE CONCURRENCY CONTROL PROBLEM FOR DATABASE MANAGEMENT SYSTEMS
2. DATABASE SYSTEMS
2.1 Database Systems without Restarts
2.2 Database Systems with Restarts
2.3 Assessing the Models
2.4 Correctness Criteria for Database Systems
3. GENERAL PURPOSE SCHEDULERS
3.1 Characterization of General Purpose Schedulers
3.2 Herbrand Interpretations
4. LOGS
4.1 Basic Definitions
4.2 Relating Logs and Computations
5. CORRECTNESS CRITERIA FOR GENERAL PURPOSE SCHEDULERS
5.1 Weak Serializability
5.2 Variations of Weak Serializability
5.3 Conflict Preserving Serializability
5.4 Extending the Results to Database Systems with Restarts
6. CONSTRUCTING GENERAL PURPOSE SCHEDULERS
6.1 Basic Design Decisions
6.2 Schedulers for Systems without Restarts
6.3 Schedulers for Systems with Restarts
7. CONFLICT-PRESERVING SCHEDULERS
7.1 Basic CPSR Strategy
7.2 Refined CPSR Strategy
7.3 An Aggressive CPSR Scheduler
PART II: THE CONCURRENCY CONTROL PROBLEM FOR TRANSACTION SYSTEMS
8. DATABASE DESCRIPTION
8.1 Many-Sorted Languages
8.2 Special Many-Sorted Languages
8.3 Special Many-Sorted Theories
8.4 Relational Databases
8.5 Examples
9. DATABASE MANIPULATION
9.1 Many-Sorted Regular Programs
9.2 Many-Sorted Concurrent Programs
10. CONCURRENT DYNAMIC LOGIC
10.1 Formal System
10.2 Examples
11. CORRECTNESS OF TRANSACTION SYSTEMS
11.1 Correctness Criteria for Transaction Systems
11.2 Proving Relative Serializability by Conflict Analysis
11.3 Extending Conflict Analysis
11.4 Proving Relative Serializability for a Special Class of Transaction Systems
11.5 Proving Consistency Preservation
11.6 Comments
12. CONCLUSION AND DIRECTIONS FOR FUTURE WORK
REFERENCES

1. INTRODUCTION

1.1 Statement of the Problem

A database models some enterprise using a set of data structures abstracting the relevant objects of that enterprise and a set of consistency criteria describing the interconnections between the objects. Data in the database adequately represents a state of the enterprise if it satisfies all correctness criteria. A user's program that records a change of state of the enterprise must therefore modify data in such a way as to preserve consistency. Such programs are called transactions [ES].

Transactions access a database through calls to a database management system (DBMS). A set of transactions modelling a database application and the corresponding DBMS form a database system.
If several transactions concurrently access a database, synchronization anomalies may arise. Anomalies may lead, for example, to the loss of the updates made by a transaction due to race conditions or to a final database state containing inconsistent data; the absence of each of these conditions is called absence of lost updates and consistency preservation, respectively. We call the problem of avoiding synchronization anomalies the concurrency control problem for database systems, whose solution requires some strategy to arbitrate access to data.

The concurrency control problem for database systems raises peculiar difficulties as compared to the similar problem found in operating systems. To begin with, the set of resources to control is potentially enormous, since any subset of the database can act as a resource. Moreover, each resource is usually accessed associatively, rather than by a name [SC].

The concurrency control problem for database systems also differs from classical job-shop scheduling [CO2]. The set of tasks (transactions) to be scheduled may not be known in advance, but rather becomes known as the scheduling evolves. Although we may consider that tasks are "ordered" by the conflicts they generate, the ordering is also not known a priori, since conflicts usually depend on the scheduling itself. These characteristics make the concurrency control problem for databases so much harder that we can hope to avoid only the synchronization anomalies, abstaining from the optimization aspects raised in classical job-shop scheduling.

The solutions to the concurrency control problem we discuss depend primarily on three factors: what anomalies must be avoided; who is responsible for concurrency control - the DBMS or the user; and how much information about transactions is used to solve the problem.
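
The first of these factors, the anomalies to be avoided, can be made concrete with the lost update described above. The following is only an illustrative sketch: the tuple format of the log, the `run` helper and the transaction names `T1` and `T2` are assumptions made for this example and are not the monograph's notation. Each transaction reads an item into a private copy and later writes that copy plus an increment back to the database.

```python
def run(log, initial):
    """Replay a log of accesses against a copy of the database.

    Hypothetical log format (an assumption for this sketch):
      (txn, "read", item)          copies the item into txn's private workspace
      (txn, "write", item, delta)  writes the private copy plus delta back
    """
    db = dict(initial)
    local = {}  # (txn, item) -> value last read by that transaction
    for step in log:
        txn, op, item = step[:3]
        if op == "read":
            local[(txn, item)] = db[item]
        elif op == "write":
            db[item] = local[(txn, item)] + step[3]
    return db


# Serial log: T1 runs to completion before T2 starts.
serial = [
    ("T1", "read", "x"), ("T1", "write", "x", 10),
    ("T2", "read", "x"), ("T2", "write", "x", 5),
]

# Interleaved log: both transactions read x before either writes it.
interleaved = [
    ("T1", "read", "x"), ("T2", "read", "x"),
    ("T1", "write", "x", 10), ("T2", "write", "x", 5),
]

serial_result = run(serial, {"x": 100})            # both increments survive
interleaved_result = run(interleaved, {"x": 100})  # T1's increment is overwritten
print(serial_result, interleaved_result)
```

In the interleaved log, `T2` writes a value computed from a copy of `x` read before `T1` wrote, so the increment made by `T1` disappears from the final state. A strategy that arbitrates access to data has to reject or reorder such a log.
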
To a large extent, we ignore other very important parameters such as the type of database in question (distributed or centralized) and the performance criteria adopted (e.g., response time tradeoffs between synchronization overhead and concurrency level).

We create two rather distinct scenarios by assuming different positions with respect to who is responsible for concurrency control and how much information about transactions is used. However, the anomalies to be avoided - lost updates and loss of consistency - remain basically the same throughout the discussion, thus providing a sense of continuity when we move from one scenario to the other.

In Part I (Chapters 2 to 7), we assume that the DBMS is responsible for concurrency control and that it must support any transaction mix accessing any database. The only information known to the DBMS is the set of data items a transaction reads or writes. In addition, we assume that the DBMS acquires such information only when it processes read or write operations, unless the transaction predeclares the set of data items it will access. The motivation for this scenario comes from general purpose DBMSs, such as System R [AS1], that do not depend on the application in question. Solutions to this problem are commonly described via algorithms, called here general purpose schedulers, designed to intercept all access requests or synchronization calls (such as lock requests), and to detect and avoid the synchronization anomalies. These schedulers have to be efficient, since they operate on-line. Moreover, their synchronization strategies must be based solely on the past history of accesses, by the assumption on transactions, and on the predeclared accesses, if they exist.

In Part II (Chapters 8 to 11), we concentrate on the opposite scenario.
We assume that the user is responsible for concurrency control, the set of transactions and the database are fixed and any fact about the transactions or the database can be used by the synchronization mechanism. These assumptions are biased towards the study of update-intensive transaction systems, such as airline and hotel reservation, point-of-sale inventory control and electronic funds transfer [BE3], whose transactions are expected to interfere heavily with each other and yet a fast response time is required.

Evidently, the schedulers developed in Part I also solve the present problem. However, their performance can be improved to achieve better response time by an analysis of what transactions compute and how they interact to discover what anomalies really arise. Special synchronization strategies can then be devised from such an analysis. As a consequence, it does not make sense to seek general solutions in this scenario. We will then present a set of tools and techniques whereby transaction systems and their correctness criteria can be described and, more important, verified. By doing so, we concentrate mostly on
