ebook img

Database management: a systems approach using Java [course notes] PDF

779 Pages·03.764 MB·English
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Database management: a systems approach using Java [course notes]

DATABASE MANAGEMENT: A SYSTEMS APPROACH USING JAVA Edward Sciore Boston College 2007 COURSE NOTES FOR CS357 AT BOSTON COLLEGE DO NOT REPRINT WITHOUT PERMISSION iii CONTENTS 1. Introduction: Why a Database System? ..........................................................................1 1.1 Databases and Database Systems 1.2 Record Storage 1.3 Multi-User Access 1.4 Memory Management 1.5 Data Models and Schemas 1.6 Physical Data Independence 1.7 Logical Data Independence 1.8 Chapter Summary 1.9 Suggested Reading 1.10 Exercises PART 1: Relational Databases .........................................................................................15 2. Data Definition..............................................................................................................17 2.1 Tables 2.2 Null Values 2.3 Keys 2.4 Foreign Keys and Referential Integrity 2.5 Integrity Constraints 2.6 Specifying Tables in SQL 2.7 Chapter Summary 2.8 Suggested Reading 2.9 Exercises 3. Data Design ...................................................................................................................31 3.1 Designing Tables is Difficult 3.2 Class Diagrams 3.3 Transforming Class Diagrams to Tables 3.4 The Design Process 3.5 Relationships as Constraints 3.6 Functional Dependencies and Normalization 3.7 Chapter Summary 3.8 Suggested Reading 3.9 Exercises iv 4. Data Manipulation ........................................................................................................73 4.1 Queries 4.2 Relational Algebra 4.3 SQL Queries 4.4 SQL Updates 4.5 Views 4.6 Chapter Summary 4.7 Suggested Reading 4.8 Exercises 5. Integrity and Security ..................................................................................................125 5.1 The Need for Integrity and Security 5.2 Assertions 5.3 Triggers 5.4 Authorization 5.5 Mandatory Access Control 5.6 Chapter Summary 5.7 Suggested Reading 5.8 Exercises 6. Improving Query Efficiency .......................................................................................143 6.1 The Virtues of Controlled Redundancy 6.2 Materialized Views 6.3 Indexes 6.4 Chapter Summary 6.5 Suggested Reading 6.6 Exercises PART 2: Building Database Applications .....................................................................169 7. Clients and Servers ....................................................................................................171 7.1 The Data-Sharing Problem 7.2 Database Clients and Servers 7.3 The Derby and SimpleDB Database Servers 7.4 Running Database Clients 7.5 The Derby ij Client 7.6 The SimpleDB Version of SQL 7.7 Chapter Summary 7.8 Suggested Reading 7.9 Exercises 8. Using JDBC ...............................................................................................................185 8.1 Basic JDBC 8.2 Advanced JDBC v 8.3 Computing in Java vs. SQL 8.4 Chapter Summary 8.5 Suggested Reading 8.6 Exercises 9. Persistent Java Objects ...............................................................................................221 9.1 The Domain Model and View of a Client Program 9.2 The Problem with Our JDBC Domain Model 9.3 The Java Persistence Architecture 9.4 The Java Persistence Query Language 9.5 Downloading a JPA Implementation 9.6 Chapter Summary 9.7 Suggested Reading 9.8 Exercises 10. Data Exchange .........................................................................................................265 10.1 Sharing Database Data with Non-Users 10.2 Saving Data in an XML Document 10.3 Restructuring an XML Document 10.4 Generating XML Data from the Database 10.5 Chapter Summary 10.6 Suggested Reading 10.7 Exercises 11. Webserver-Based Database Clients ..........................................................................293 11.1 Types of Database Clients 11.2 Interacting with a Web Server 11.3 Basic Servlet Programming 11.4 Managing Database Connections 11.5 Configuring a Servlet Container 11.6 Chapter Summary 11.7 Suggested Reading 11.8 Exercises PART 3: Inside the Database Server ..............................................................................317 12. Disk and File Management .......................................................................................321 12.1 Persistent Data Storage 12.2 The Block-Level Interface to the Disk 12.3 The File-Level Interface to the Disk 12.4 The Database System and the OS 12.5 The SimpleDB File Manager 12.6 Chapter Summary 12.7 Suggested Reading 12.8 Exercises vi 13. Memory Management ...............................................................................................355 13.1 Two Principles of Database Memory Management 13.2 Managing Log Information 13.3 The SimpleDB Log Manager 13.4 Managing User Data 13.5 The SimpleDB Buffer Manager 13.6 Chapter Summary 13.7 Suggested Reading 13.8 Exercises 14. Transaction Management ..........................................................................................385 14.1 Transactions 14.2 Using Transactions in SimpleDB 14.3 Recovery Management 14.4 Concurrency Management 14.5 Implementing SimpleDB Transactions 14.6 Testing Transactions 14.7 Chapter Summary 14.8 Suggested Reading 14.9 Exercises 15. Record Management .................................................................................................443 15.1 The Record Manager 15.2 Implementing a File of Records 15.3 The SimpleDB Record Manager 15.4 Chapter Summary 15.5 Suggested Reading 15.6 Exercises 16. Metadata Management ...............................................................................................477 16.1 The Metadata Manager 16.2 Table Metadata 16.3 View Metadata 16.4 Statistical Metadata 16.5 Index Metadata 16.6 Implementing the Metadata Manager 16.7 Chapter Summary 16.8 Suggested Reading 16.9 Exercises 17. Query Processing ......................................................................................................499 17.1 Scans 17.2 Update Scans 17.3 Implementing Scans 17.4 Pipelined Query Processing vii 17.5 The Cost of Evaluating a Scan 17.6 Plans 17.7 Predicates 17.8 Chapter Summary 17.9 Suggested Reading 17.10 Exercises 18. Parsing........................................................................................................................539 18.1 Syntax vs. Semantics 18.2 Lexical Analysis 18.3 Implementing the Lexical Analyzer 18.4 Grammars 18.5 Recursive-Descent Parsers 18.6 Adding Actions to the Parser 18.7 Chapter Summary 18.8 Suggested Reading 18.9 Exercises 19. Planning .....................................................................................................................567 19.1 The SimpleDB Planner 19.2 Verification 19.3 Query Planning 19.4 Update Planning 19.5 Implementing the SimpleDB Planner 19.6 Chapter Summary 19.7 Suggested Reading 19.8 Exercises 20. The Database Server .................................................................................................585 20.1 Server Databases vs. Embedded Databases 20.1 Client-Server Communication 20.3 Implementing the Remote Interfaces 20.4 Implementing the JDBC Interfaces 20.5 Chapter Summary 20.6 Suggested Reading 20.7 Exercises PART 4: Efficient Query Processing .............................................................................603 21. Indexing ....................................................................................................................605 21.1 The Index Interface 21.2 Static Hash Indexes 21.3 Extendable Hash Indexes 21.4 B-Tree Indexes 21.5 Index-Aware Operator Implementations viii 21.6 Index Update Planning 21.7 Chapter Summary 21.8 Suggested Reading 21.9 Exercises 22. Materialization and Sorting.......................................................................................653 22.1 The Value of Materialization 22.2 Temporary Tables 22.3 Materialization 22.4 Sorting 22.5 Grouping and Aggregation 22.6 Merge Joins 22.7 Chapter Summary 22.8 Suggested Reading 22.9 Exercises 23. Effective Buffer Utilization ......................................................................................691 23.1 Buffer Usage in Query Plans 23.2 MultiBuffer Sorting 23.3 MultiBuffer Product 23.4 Implementing the Multibuffer Operations 23.5 Hash Joins 23.6 Comparing the Join Algorithms 23.7 Chapter Summary 23.8 Suggested Reading 23.9 Exercises 24. Query Optimization ..................................................................................................715 24.1 Equivalent Query Trees 24.2 The Need for Query Optimization 24.3 The Structure of a Query Optimizer 24.4 Finding the Most Promising Query Tree 24.5 Finding the Most Efficient Plan 24.6 Combining the Two Stages of Optimization 24.7 Merging Query Blocks 24.8 Chapter Summary 24.9 Suggested Reading 24.10 Exercises References ........................................................................................................................761 1 1 INTRODUCTION: WHY A DATABASE SYSTEM? Database systems are a big deal in the computer industry. Some database systems (such as Oracle) are enormously complex, and typically run on large, high-end machines. Most database systems have a special user, called the database administrator (or DBA), who is responsible for its smooth operation. The DBA in a large database installation is very often a dedicated full-time employee; some installations even have several full-time administrators. This chapter examines the features a database system is expected to have, to better understand why it takes so many resources to implement these features. 1.1 Databases and Database Systems A database is a collection of data stored on a computer. The data in a database is typically organized into records, such as employee records, medical records, sales records, etc. Figure 1-1 depicts a database that holds information about students in a university and the courses they have taken. This database will be used as a running example throughout the book. The database of Figure 1-1 contains five types of records: There is a STUDENT record for each student that has attended the university. Each record contains the student‟s ID number, name, graduation year, and ID of the student‟s major department. There is a DEPT record for each department in the university. Each record contains the department‟s ID number and name. There is a COURSE record for each course offered by the university. Each record contains the course‟s ID number, title, and the ID of the department that offers it. There is a SECTION record for each section of a course that has ever been given. Each record contains the section‟s ID number, the year the section was offered, the ID of the course, and the professor teaching that section. There is an ENROLL record for each course taken by a student. Each record contains the enrollment ID number, the ID numbers of the student and the section of the course taken, and the grade the student received for the course. 2 STUDENT SId SName GradYear MajorId 1 joe 2004 10 2 amy 2004 20 3 max 2005 10 4 sue 2005 20 5 bob 2003 30 6 kim 2001 20 7 art 2004 30 8 pat 2001 20 9 lee 2004 10 COURSE CId Title DeptId DEPT DId DName 10 compsci 12 db systems 10 20 math 22 compilers 10 30 drama 32 calculus 20 42 algebra 20 52 acting 30 62 elocution 30 SECTION SectId CourseId Prof YearOffered 13 12 turing 2004 23 12 turing 2005 33 32 newton 2000 43 32 einstein 2001 53 62 brando 2001 ENROLL EId StudentId SectionId Grade 14 1 13 A 24 1 43 C 34 2 43 B+ 44 4 33 B 54 4 53 A 64 6 53 A Figure 1-1: Some records for a university database This database uses several small records to store information about one real-world object. For example, not only does each student in the university have a STUDENT record, but the student also has an ENROLL record for each course taken. Similarly, each course has a COURSE record and several SECTION records – there will be one SECTION record for each time the course has been offered. There are, of course, many other ways to structure the same information into records. For example, we could put all information

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.