Professional XML Databases PDF

840 Pages·2000·6.99 MB·English

Preview Professional XML Databases

Professional XML Databases

Kevin Williams, Michael Brundage, Patrick Dengler, Jeff Gabriel, Andy Hoskinson, Michael Kay, Thomas Maxwell, Marcelo Ochoa, Johnny Papa, Mohan Vanmane

Wrox Press Ltd.

Professional XML Databases
© 2000 Wrox Press

All rights reserved. No part of this book may be reproduced, stored in a retrieval system or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embodied in critical articles or reviews.

The author and publisher have made every effort in the preparation of this book to ensure the accuracy of the information. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, Wrox Press nor its dealers or distributors will be held liable for any damages caused or alleged to be caused either directly or indirectly by this book.

Published by Wrox Press Ltd, Arden House, 1102 Warwick Road, Acocks Green, Birmingham, B27 6BH, UK
Printed in Canada
ISBN 1861003587

Trademark Acknowledgements

Wrox has endeavored to provide trademark information about all the companies and products mentioned in this book by the appropriate use of capitals. However, Wrox cannot guarantee the accuracy of this information.
Credits

Authors: Kevin Williams, Michael Brundage, Patrick Dengler, Jeff Gabriel, Andy Hoskinson, Michael Kay, Thomas Maxwell, Marcelo Ochoa, Johnny Papa, Mohan Vanmane
Technical Architect: Jon Duckett
Technical Editors: Chris Mills, Andrew Polshaw, Lisa Stephenson
Category Manager: Dave Galloway
Author Agent: Tony Berry
Project Manager: Avril Corbin
Production Manager: Simon Hardware
Production Project Coordinator: Mark Burdett
Indexing: Alessandro Ansa
Figures: Shabnam Hussain
Cover: Shelley Frazier
Proof Readers: Diana Skeldon, Agnes Wiggers
Technical Reviewers: Danny Ayers, David Baliles, Cary Beuershausen, Matt Birbeck, Maxime Bombadier, Bob Cherinka, Michael Corning, Jeremy Crosbie, Dino Esposito, Nazir Faisal, Sam Ferguson, Constantinos Hadjisotiriou, Scott Haley, Alex Homer, Michael Kay, Jim Macintosh, Craig McQueen, Thomas B. Passin, David Schult, Marc H. Simkin, Dave Sussman, Dorai Thodla, Beverley Treadwell, Warren Wiltsie

About the Authors

Kevin Williams

Kevin's first experience with computers was at the age of 10 (in 1980) when he took a BASIC class at a local community college on their PDP-9, and by the time he was 12, he stayed up for four days straight hand-assembling 6502 code on his Atari 400. His professional career has been focussed on Windows development – first client-server, then onto Internet work. He's done a little bit of everything, from VB to Powerbuilder to Delphi to C/C++ to MASM to ISAPI, CGI, ASP, HTML, XML, and any other acronym you might care to name; but these days, he's focusing on XML work. Kevin is currently working with the Mortgage Bankers' Association of America to help them put together an XML standard for the mortgage industry.

Michael Brundage

Michael Brundage works as a software developer on Microsoft's WebData Internet team, where he develops XML features for SQL Server 2000. Michael participates actively in the design of the XML Query Language, producing Microsoft's prototype for the W3C Working Group.
Before Microsoft, Michael was the Senior Software Engineer for NASA's Interferometry Science Center at Caltech, where he developed networked collaborative environments and a simulation of radiative transfer.

Michael would like to thank his wife Yvonne for her patience; Dave Van Buren, friend and mentor, for starting it all; Microsoft for allowing him to write; Chris Suver and Paul Cotton for reviewing early drafts; and everyone at Wrox Press for their help, humor, and flexibility.

Patrick Dengler

Patrick is busily growing Internet startups throughout the "Silicon Forest" area. His interests include building companies by creating frameworks for Internet architectures. He has received several patents in stateless Internet database architectures.

I want to thank my lovely, graceful and beautiful wife Kelly for simply putting up with me. Without her and my family, Devin, Casey, and Dexter, I wouldn't be whole.

Jeff Gabriel

Jeff Gabriel currently works as a developer for eNationwide, the e-commerce arm of Nationwide Insurance Systems. Jeff is an MCSE, and was formerly a Webmaster before finding the call to be true geek too strong. He enjoys spending time with his wife Meredith and two children, Max and Lily. He also likes to read books about technology and computers when not working on same.

Thanks to my family for understanding the long hours it took to write for this book, and my great desire to do it. I also thank God, who has answered my prayers with many great opportunities. Finally, thanks to the guys at ATGI Inc. Thanks to Matt for your excellent direction and support over the years, and to Jason, an incomparable source for all things Java.

Andy Hoskinson

Andy Hoskinson is a senior technical director for a leading Internet professional services firm. He develops enterprise-class Internet solutions using a variety of technologies, including Java and XML. Andy is a co-author of Professional Java Server Programming, J2EE Edition (Wrox Press, Sept. 2000).
He is also a co-author of Microsoft Commerce Solutions (Microsoft Press, April 1999), and has contributed to several different technical publications, including Active Server Developer's Journal and Visual J++ Developer's Journal. Andy is a Sun Certified Java Programmer and Microsoft Certified Solution Developer, and lives in Northern Virginia with his wife Angie. Andy can be reached at [email protected].

Michael Kay

Michael Kay has spent most of his career as a software designer and systems architect with ICL, the IT services supplier. As an ICL Fellow, he divides his time between external activities and mainstream projects for clients, mainly in the area of electronic commerce and publishing. His background is in database technology: he has worked on the design of network, relational, and object-oriented database software products – as well as a text search engine. In the XML world he is known as the developer of the open source Saxon product, the first fully-conformant implementation of the XSLT standard. Michael lives in Reading, Berkshire with his wife and daughter. His hobbies include genealogy and choral singing.

Thomas Maxwell

Thomas Maxwell has worked the last few years for eNationwide, the Internet arm of one of the world's largest insurance companies, developing advanced internet/intranet applications – many of which utilized XML databases. He also continues to work with his wife Rene to develop cutting edge Internet applications, such as the XML-based Squirrel Tech Engine, for Creative Squirrel Solutions – a technical project implementation firm. Tom's technical repertoire includes such tools as Visual Basic, ASP, COM+, Windows DNA and of course XML. Tom can be reached at [email protected].

During the writing of this book I became the proud father of my wife's and my first child. So I would like to thank, firstly, my wife for being understanding of my desire to meet the book's deadlines.
And secondly to the staff of Wrox for understanding that a new baby sometimes makes it difficult to meet deadlines. I would also like to thank the understanding people who helped with the non-book things that allowed me the time to contribute to this book, including Tom Holquist, who understands why one may be a little late to the office once in a while, and my family, including Marlene and Sharon, for helping with Gabrielle in the first few weeks.

Marcelo Ochoa

Marcelo Ochoa works at the System Laboratory of Facultad de Ciencias Exactas, of the Universidad Nacional del Centro de la Provincia de Buenos Aires, and as an external consultant and trainer for Oracle Argentina. He divides his time between University jobs and external projects related to Oracle web technologies. He has worked on several Oracle-related projects, such as the translation of Oracle manuals and multimedia CBTs. His background is in database, network, Web and Java technologies. In the XML world he is known as the developer of the DB Producer for the Apache Cocoon project, the framework that permits generating XML on the database side.
Summary of Contents

Introduction
Chapter 1: XML Design for Data
Chapter 2: XML Structures for Existing Databases
Chapter 3: Database Structures for Existing XML
Chapter 4: Standards Design
Chapter 5: XML Schemas
Chapter 6: DOM
Chapter 7: SAX – The Simple API for XML
Chapter 8: XSLT and XPath
Chapter 9: Relational References with XLink
Chapter 10: Other Technologies (XBase, XPointer, XInclude, XHTML, XForms)
Chapter 11: The XML Query Language
Chapter 12: Flat Files
Chapter 13: ADO, ADO+, and XML
Chapter 14: Storing and Retrieving XML in SQL Server 2000
Chapter 15: XML Views in SQL Server 2000
Chapter 16: JDBC
Chapter 17: Data Warehousing, Archival, and Repositories
Chapter 18: Data Transmission
Chapter 19: Marshalling and Presentation
Chapter 20: SQL Server 2000 XML Sample Applications
Chapter 21: DB Prism: A Framework to Generate Dynamic XML from a Database
Appendix A: XML Primer
Appendix B: Relational Database Primer
Appendix C: XML Schema Datatypes
Appendix D: SAX 2.0: The Simple API for XML
Appendix E: Setting Up a Virtual Directory for SQL Server 2000
Appendix F: Support, Errata and P2P.Wrox.Com
Index

Table of Contents

Introduction
  Why XML and Databases
  What This Book is About
  Who Should Use This Book?
  Data Analysts
  Relational Database Developers
  XML Developers
  Understanding the Problems We Face
  Structure of the Book
  Design Techniques
  Technologies
  Data Access
  Common Tasks
  Case Studies
  Appendices
  Technologies Used in the Book
  Conventions
  Customer Support
  Source Code and Updates
  Errata

Chapter 1: XML Design for Data
  XML for Text Versus XML for Data
  XML for Text
  XML for Data
  Representing Data in XML
  Element Content Models
  Element-only Content
  Mixed Content
  Text-only Content
  EMPTY Content
  ANY Content
  Using Attributes
  Other Considerations
  Audience
  Performance
  Data Modeling Versus Representation Modeling
  XML Data Structures – A Summary
  Mapping Between RDBMS and XML Structures
  Structure
  Elements
  Attributes
  Data Points
  Our Invoice Using Elements
  Our Invoice Using Attributes
  Comparing the Two Approaches
  Elements or Attributes – The Conclusion
  Relationships
  Containment
  More Complex Relationships – Pointers
  More Complex Relationships – Containment
  Relationships – Conclusion
  Sample Modeling Exercise
  Before We Begin
  What is the Scope of the Document?
  Which Structures Are We Modeling?
  What Are the Relationships Between Entities?
  Which Data Points Need to be Associated with Each Structure?
  Creating the XML DTD
  Start With the Structures
  Add the Data Points to the Elements
  Incorporate the Relationships
  Sample XML Documents
  Summary

Chapter 2: XML Structures for Existing Databases
  Migrating a Database to XML
  Scoping the XML Document
  Creating the Root Element
  Model the Tables
  Model the Nonforeign Key Columns
  Adding ID Attributes
  Handling Foreign Keys
  Add Enumerated Attributes for Lookup Tables
  Add Element Content to the Root Element
  Walk the Relationships
  Add Missing Elements to the Root Element
  Discard Unreferenced ID attributes
  An Example XML Document
  Summary

Chapter 3: Database Structures for Existing XML
  How to Handle the Various DTD Declarations
  Element Declarations
  The Element-only (Structured Content) Model
  The Text-only Content Model
  The EMPTY Content Model
  The Mixed Content Model
  The ANY Content Model
  Attribute List Declarations
  CDATA
  Enumerated Lists
  ID and IDREF
  IDREFS
  NMTOKEN and NMTOKENS
  ENTITY and ENTITIES
  Entity Declarations
  Notation Declarations
  Avoid Name Collisions!
  Summary
  Example
  Modeling the Attributes
  Summary
  The Rules

Chapter 4: Standards Design
  Scoping the Solution
  Types of Standards
  System Internal Standards
  Cross-system Standards
  Industry-level Standards
  Document Usage
  Archival Documents
  Transactional Data Documents
  Presentation Layer Documents
  Before Diving In: Ground Rules
  Implementation Assumptions
  Elements vs. Attributes
  Restricting Element Content
  Don't Allow the ANY Element Type
  Don't Allow the Mixed-content Element Type
  Constrain Elements that have Structured Content
  Capturing Strong Typing Information
  Naming Conventions
  Understanding the Impact of Design Decisions
  Performance
  Document Size
  Overnormalization
  Too Many Pointing Relationships
  Coding Time
  Document Complexity
  Pointing Relationships
  Levels of Abstraction
  Developer Ramp-up Time
  Extensibility
  During the Development
  Subdividing the Workload
  Data issues
  General vs. Specific
  Required vs. Optional
  "Tag soup"
  Keeping the Structure Representation-Independent

Description:
In addition to being a tutorial for learning how to use XML as an effective way to represent and transmit data across the Web, Professional XML Databases also covers how to work with XML in the current generation of Microsoft tools, like Internet Explorer and SQL Server 2000. For any developer or ma