Using XML

652 Pages·2000·5.671 MB·English

Special Edition Using XML
by Lee Anne Phillips
ISBN: 0789719967
Que © 2000, 879 pages

A must-have tutorial to XML documents and applications.

Synopsis

This Special Edition guide steps the reader from the basic concepts behind creating XML documents and applications through to technically sophisticated concepts and projects. The author knows the ins and outs of XML and can make informed comparisons to HTML, as she is also the author of Practical HTML. This Special Edition guide is expertly written and is one that will be referred to time and time again by anyone working with XML coding.

Table of Contents

Special Edition Using XML
Introduction

Part I: XML Fundamentals
Chapter 1 - Introduction to XML
Chapter 2 - Understanding XML Syntax
Chapter 3 - Understanding XML Document Type Definitions
Chapter 4 - Extending a Document Type Definition with Local Modifications
Chapter 5 - Building a DTD from Scratch
Chapter 6 - XML Namespaces
Chapter 7 - XML Schemas
Chapter 8 - XPath
Chapter 9 - XLink and XPointer

Part II: Manipulating XML
Chapter 10 - DOM: The Document Object Model
Chapter 11 - SAX: The Simple API for XML
Chapter 12 - CSS1, CSS2, DSSSL, XSL
Chapter 13 - Cascading Style Sheets and XML/XHTML
Chapter 14 - Using Next-Generation Extensible Style Sheets: XSL

Part III: Integrating XML with Other Technologies
Chapter 15 - Getting Information About the User
Chapter 16 - Security and Privacy
Chapter 17 - Using Server-Side Facilities: Java
Chapter 18 - Using Server-Side Facilities: ActiveX
Chapter 19 - Using Common Object Brokers
Chapter 20 - Enterprise Resource Planning and XML

Part IV: Other Applications of XML
Chapter 21 - RDF: The Resource Description Framework
Chapter 22 - XHTML in Detail
Chapter 23 - Using SMIL: The Synchronized Multimedia Markup Language
Chapter 24 - Using MathML: The Math Markup Language
Chapter 25 - Accessibility
Chapter 26 - Industry Cooperation and Resources
Chapter 27 - Summing Up: The Future of the Web

Appendix A - XML/XHTML Reference
Appendix B - Tools for XML/XHTML Editing and Conversion
Appendix C - Contents of the CD
Appendix D - Character Entities
Appendix E - CSS1, CSS2 Reference
Terms Used in This Book
Index
List of Figures
List of Tables
List of Sidebars

Back Cover

Special Edition Using XML is a comprehensive tutorial/reference that shows you how to create XML documents and applications. This book begins with basic concepts and progresses through the creation of sophisticated XML projects.

• Comprehensive tutorial/reference that covers creating XML documents and applications
• Extremely practical, hands-on book that contains concepts, tools, specifications, and applications for XML

About the Author

Lee Anne Phillips has been involved in data processing and networking since that happy day the professor of her first programming class at UC Berkeley persuaded her that she had a talent for this stuff and should rethink her planned career in linguistic psychology and tweak bits instead. From UCB she traveled through various incarnations as a mainframe systems programmer, firmware designer, engineer, software architect, programming and project manager, and finally consultant until she just had to tell somebody about it. She resides in the San Francisco Bay Area, her place of birth, holds a bachelor's degree in computer science, and has a great sense of humor. She can be found at http://www.leeanne.com/. She wrote Practical HTML 4 (1999).

Special Edition Using XML
Bestseller Edition
Lee Anne Phillips

Associate Publisher: Tracy Dunkelberger
Acquisitions Editor: Todd Green
Development Editors: Sean Dixon, Jeff Durham
Managing Editor: Thomas F. Hayes
Project Editor: Leah Kirkpatrick
Copy Editor: Julie McNamee
Indexer: Kelly Castell
Proofreader: Harvey Stanbrough
Technical Editors: Marshall Jansen, Jay Aguilar, Karl Fast, Benoit Marchel, Dallas Releford
Team Coordinator: Cindy Teeters
Media Developer: Craig Atkins
Interior Designer: Ruth Harvey
Cover Designers: Dan Armstrong, Ruth Harvey
Production: Darin Crone, Steve Geiselman

Special Edition Using XML
Copyright © August 2000 by Que

All rights reserved. No part of this book shall be reproduced, stored in a retrieval system, or transmitted by any means, electronic, mechanical, photocopying, recording, or otherwise, without written permission from the publisher. No patent liability is assumed with respect to the use of the information contained herein. Although every precaution has been taken in the preparation of this book, the publisher and author assume no responsibility for errors or omissions. Nor is any liability assumed for damages resulting from the use of the information contained herein.

International Standard Book Number: 0-7897-1996-7
Library of Congress Catalog Card Number: 99-60197
02 01 00 4 3 2 1

Trademarks

All terms mentioned in this book that are known to be trademarks or service marks have been appropriately capitalized. Que cannot attest to the accuracy of this information. Use of a term in this book should not be regarded as affecting the validity of any trademark or service mark.

Warning and Disclaimer

Every effort has been made to make this book as complete and as accurate as possible, but no warranty or fitness is implied. The information provided is on an “as is” basis. The author(s) and the publisher shall have neither liability nor

About the Author

Lee Anne Phillips has been involved in data processing and networking since that happy day the professor of her first programming class at UC Berkeley persuaded her that she had a talent for this stuff and should rethink her planned career in linguistic psychology and tweak bits instead.
From UCB she traveled through various incarnations as a mainframe systems programmer, firmware designer, network engineer, software architect, programming and project manager, and finally consultant until she just had to tell somebody about it. She resides in the San Francisco Bay Area, her place of birth, has a bachelor’s degree in computer science, and has a great sense of humor. Visit her Web page at http://www.leeanne.com/.

Dedication

To Alison Eve Ulman, dearest rose among the thorns. Your desperate and transgressive art has inspired my own with clearer vision and profoundest passion.

And to Dangerous Downtown Dave, sweet companion of many years, whose adventurous life took him from the bitter snows of Salt Lake City, Utah, through the empty sunshine of Southern California and into the welcoming fog and rain of the San Francisco Bay Area. He was a rowdy brawler to the end but a gentle and loving friend to his surrogate Mom.

Acknowledgments

My grateful appreciation to my initial editor at Que, Todd Green, for many patient hours of effort persuading me to rough out the evolving outline and then refine it into a plan. This, in spite of my perfectly natural inclination to revise and extend as I go along, causing stalwart editors to tear out their hair and the noble production crew to weep with frustration.

My thanks especially to Jeff Durham, my development editor, for his helpful feedback and advice; to Jeremy H. Griffith, my technical editor, for catching many fuzzy lapses from clarity and a few outright mistakes; to the countless editors and staff at Que whose many hands and minds have helped turn my words into printable form. If you’ll glance at the back of the title page, you’ll see a list of some few of their names, and each of them has my undying gratitude.

My gratitude also to the typographers, graphic artists, printers, bookbinders, and all the rest who actually produced the physical book you hold in your hands.
And to the delivery people and booksellers of every description whose faith in it has led it to the rack or Web page you found it on. And of course you, Dear Reader, who might be thinking of buying this book right this minute, I hope!

So thank you all, readers, editors, bookstores, printers and everyone. This is the work of many hands and is now in yours.

Tell Us What You Think!

As the reader of this book, you are our most important critic and commentator. We value your opinion and want to know what we’re doing right, what we could do better, what areas you’d like to see us publish in, and any other words of wisdom you’re willing to pass our way.

As an Associate Publisher for Que, I welcome your comments. You can fax, email, or write me directly to let me know what you did or didn’t like about this book, as well as what we can do to make our books stronger.

Please note that I cannot help you with technical problems related to the topic of this book, and that due to the high volume of mail I receive, I might not be able to reply to every message.

When you write, please be sure to include this book’s title and author as well as your name and phone or fax number. I will carefully review your comments and share them with the author and editors who worked on the book.

Fax: 317-581-4666
Email: [email protected]
Mail: Tracy Dunkelberger
      Que
      201 West 103rd Street
      Indianapolis, IN 46290 USA

Introduction

This book is dedicated to making sense out of the raft of competing (and sometimes conflicting) XML-related standards and proposals afloat on the great sea of XML possibilities. Many of the facilities most needed by users (that’s you, Dear Reader) are supplied by means of a half dozen or more differing “standards” with varying degrees of support from only a handful of vendors. It’s enough to make you tear out your hair.

But the concepts of XML are about as simple as taking up a red pen and making notes on a text as you read.
Unlike HTML, which has a very limited vocabulary, XML gives you almost unlimited freedom to describe your documents in any way you choose.

XML is fast becoming the lingua franca of the Web, with new vendors climbing onboard the XML bandwagon every day. This book is designed to help you catch a ride with everyone else.

Many major corporations with significant resources have already invested in XML, among which are some of the heavy hitters in the industry. Microsoft, IBM, SAP, Netscape, Oracle, Sun Microsystems, the US government, the US military: the list reads like a roll call of the top Fortune 5000 companies and major governmental agencies few of us can afford to ignore, even if we don’t deal with them directly.

This is a critical time in the development of XML; standards are being proposed and promulgated faster than any one person can keep track of. There are sometimes proposals from World Wide Web Consortium (W3C) members, alternatives from people with no affiliation at all, user group initiatives, and attempts to reconcile them all.

This book will help you sort them all out, tell you where to look for more information to help you keep up with new developments as they occur, and help build a conceptual framework to let you fit new bits of information into an existing structure as you go along. Have fun.

Yet everyone online depends on the W3C. Making Web applications that work is hard enough without dealing with a hundred custom systems from different individual manufacturers. Two main approaches are one too many. Like metric and SAE tools, our workbenches are cluttered enough with necessary components without having to worry about whether this particular bolt requires a 19mm or a 3/4 inch wrench. By promulgating standards, W3C ensures that some basic level of interoperability exists among Web applications.
Manufacturers who deviate from the standards risk looking foolish in the long run, however they might try to “spin” their decisions or attempts to impose standards of their own on the rest of the industry.

So XML and its related standards are all compromises between some Platonic ideal methodology and the gritty business of trying to get on with life on the Web. The various working groups within W3C have taken different tacks in some cases, introducing incompatibilities or inconsistencies into the standards themselves, which might or might not be ironed out in the final analysis. In the meantime, we have to get on with our jobs and make Web applications work.

And work they will and do. So let’s get on with it. After we’re up to speed, we’re in for a thrilling ride.

Who Should Use This Book

This book is designed for professional Web designers, programmers, database and content specialists, and almost anyone involved in publishing or sharing information over any sort of network. XML and its related standards offer the Web community a valuable and flexible way to organize and share data. Potential user communities include

• Content providers and authors
• Database users
• Programmers
• Web designers
• Scientists and scholars
• Researchers and analysts
• Indexing and search engine providers
• Anyone who wants to learn more about XML and the future of the Web

How This Book Is Organized

There are six major sections to the book:

• Part I, “XML Fundamentals,” introduces you to the concepts and facilities needed to use XML and its related standards effectively.
• Part II, “Manipulating XML,” examines the tools used to control the structure and presentation of the XML document itself: the Document Object Model, Style Sheets, and SAX, the Simple API for XML.
• Part III, “Integrating XML with Other Technologies,” delves into the database and server-side applications XML is particularly suited for. Real-world examples illustrate the nuts and bolts of applications working in the business environment today. In addition, the important issues of online privacy and security are discussed in the context of current industry practices and problems.
• Part IV, “Other Applications of XML,” fills in the gaps with discussions of how to integrate existing Web documents with XML using XHTML, ways to create standard descriptive vocabularies using the Resource Description Framework, and multimedia and scientific languages that show how XML or XML-related standards might simplify existing tasks or make possible new interactions as yet undreamed of. Finally, you’ll explore how XML can make the Web more accessible for everyone and look at where we might be heading with Web development in the near future.
• Appendices and the special reference pages on the inside front and back covers offer a handy quick-reference to the basics, stripped down to their most essential parts for a quick reminder of how to do a given task or what this or that feature looks like.
• And finally, the bound-in CD-ROM has a selection of the most valuable references and tools for XML production, either as freeware, shareware, or demoware, together with all the code from the book and valuable reference documents, such as a Web color chart, the XHTML named entities, and more.

Conventions Used in This Book

This book uses special conventions to help you get the most from this book and from XML.

Text Conventions

Various typefaces in this book identify terms and other special objects. These special typefaces include the following:

Type             Meaning
Italic           New terms or phrases when initially defined. An italic term followed by a page number indicates the page where that term is first defined.
Monospace        Code, output, and Web addresses appear in a computer type font indicating these are all things that will either be typed into or appear on the computer screen.
Bold Monospace   Specific input the user is supposed to type is in a bold computer type font.

Special Elements

Tip: Tips are designed to point out features, annoyances, and tricks of the trade that you might otherwise miss. These will help you write XML quickly and effectively.

Note: Notes point out items that you should be aware of, although you can skip these if you’re in a hurry. Generally, notes are a way for you to gain some extra information on a topic without weighing yourself down.

Caution: Pay attention to Cautions! These could save you precious hours in lost work. Don’t say we didn’t warn you.

Each chapter also ends with a “Troubleshooting” or “Getting Down to Cases” element. A “Troubleshooting” section helps you overcome common problems with using XML. A “Getting Down to Cases” section shows you how the subject of the chapter applies to everyday XML development or provides extra information that is interesting or useful.

Part I: XML Fundamentals

Chapter List

Chapter 1: Introduction to XML
Chapter 2: Understanding XML Syntax
Chapter 3: Understanding XML Document Type Definitions
Chapter 4: Extending a Document Type Definition with Local Modifications
Chapter 5: Building a Document Type Definition from Scratch
Chapter 6: XML Namespaces
Chapter 7: XML Schemas
Chapter 8: XPath
Chapter 9: XLink and XPointer

Chapter 1: Introduction to XML

Making All Things Possible with XML

The Internet community is pouring an enormous amount of energy, money, and effort into developing an extensive suite of related standards centered around XML, Extensible Markup Language, the next generation of document delivery methods on the Web.
In 1999 alone, more standards and drafts, almost all XML-related, have been delivered or proposed than in the history of the World Wide Web Consortium (W3C), the body responsible for Web standards. In 2000, several dozen more XML-related standards will be delivered, doubling the number of W3C standards and extending the cohesive power of XML into all corners of the World Wide Web.

XML and its related standards allow you to replace or extend proprietary tagging systems, such as Allaire’s Cold Fusion and Microsoft’s Active Server Pages (ASP), with platform-independent languages that fit the problem space of your page precisely. Instead of (or in addition to) inserting special tags or comments explaining what a particular field means, the field itself can be made meaningful to both applications and human readers.

So an annotated price list in HTML, which might look like this:

    <!-- Price list for individual fruits -->
    <dl>
      <!-- Fruit -->
      <dt>Apples</dt>
      <!-- Price -->
      <dd>$1</dd>
      <!-- Fruit -->
      <dt>Oranges</dt>
      <!-- Price -->
      <dd>$2</dd>
    </dl>

can be made to look like this:

    <FruitPriceList>
      <Fruit>Apples</Fruit>
      <Price>$1</Price>
      <Fruit>Oranges</Fruit>
      <Price>$2</Price>
    </FruitPriceList>

The above shows a tiny example of what can be accomplished in making data easier to access using XML. Not only is the information less cluttered and more clearly presented, but the fields are also identifiable by a search engine. So, apples to eat can be readily distinguished from the Big Apple (New York City) and the apple of one’s eye (a person or thing one is fond of).

Whereas we had to fit our HTML data into the Procrustean bed of an HTML definition list to lay out the list in the manner we wanted, in XML we can let the data structure flow from the data itself, and use XML-related standards like Cascading Style Sheets (CSS) or Extensible Stylesheet Language (XSL) to format the page.
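As a rough illustration of what it means for fields to be identifiable by a program, here is a minimal sketch that reads the fruit price list shown above and pairs each fruit with its price. Python and its standard `xml.etree.ElementTree` module are assumptions made for this illustration and are not tools discussed in the text; the helper name `read_prices` is hypothetical.

```python
import xml.etree.ElementTree as ET

# The XML price list from the text, unchanged.
PRICE_LIST = """
<FruitPriceList>
  <Fruit>Apples</Fruit>
  <Price>$1</Price>
  <Fruit>Oranges</Fruit>
  <Price>$2</Price>
</FruitPriceList>
"""


def read_prices(xml_text):
    """Return a dict mapping each fruit name to its price string.

    Hypothetical helper: it assumes Fruit and Price elements appear
    in matching order, as they do in the example document.
    """
    root = ET.fromstring(xml_text.strip())
    fruits = [element.text for element in root.findall("Fruit")]
    prices = [element.text for element in root.findall("Price")]
    return dict(zip(fruits, prices))


prices = read_prices(PRICE_LIST)
print(prices)
```

Because the element names carry the meaning, the program asks for `Fruit` and `Price` by name instead of guessing from layout tags such as `<dt>` and `<dd>`.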
Also, the XML version allows us to retain information about the type of data entered in every field. HTML allows us to identify only a half-dozen or so datatypes: abbreviations, acronyms, addresses, block quotes, citations, and variables. And even these are most often (mis)used to affect formatting rather than to identify a logical field.

Understanding HTML Limitations

As the Web has grown over the past 10 years, users have discovered more and more ways of communicating with each other. The foundation of that interchange has been HTML, the Hypertext Markup Language. HTML has been used to present everything from scholarly papers to online catalogs and poetry. However, the structure of Web pages based in HTML says little about the actual information content.

The example of a definition list shown previously illustrates the fact that most of the tags offered by HTML affect only the crude layout and presentation of the text on the page, and even that layout information is inflexible. Tags are often used to present information in ways that stretch or violate the meaning of the tags themselves. The HTML definition list coded previously had no definitions listed, for example, but was used only to line up fruits and prices in a particular way.

So, there have been many workarounds to try to impose some sort of flexible order on the data contained on the page. Many of these ad hoc solutions, such as ASP or Cold Fusion, have been moderately successful. For the most part, however, they represent proprietary approaches that have to be reinvented for each new problem domain or require the use of specific server software and hardware that may not fit into your purchasing plans.

By now, you’ve probably run into the limitations of HTML. You’ve experienced the frustration of not being able to describe exactly what you want to do using the structures available to you as a Web designer or author.
You’ve probably been forced to use inaccessible mechanisms, such as frames or tables, to coerce your page into an unreliable typographical layout, or you’ve used ugly <pre> preformatted and/or <tt> typewriter text tags to align data properly.

XML is a new standard that allows you to extend the descriptive power of your document almost at will and alter it to suit different purposes as needed. XML makes a lot of things possible that were only vague yearnings before.

Improving Precision with XML

XML enables you to describe your document exactly in a way that can be “understood” by a machine. Although humans have no trouble looking at a page, such as an invoice, and deducing what certain layouts mean, computers aren’t quite that smart. They need help. Descriptive XML tags such as <seller> and <price> make far more sense to machines than the anonymous layout tags that HTML currently provides. XML provides a mechanism, the Document Type Definition (DTD), which lets you share knowledge about the structure of your data with anyone you choose.

Validating Document Structure with XML

XML enables you to force validation of the structure of your document. You can enforce the presence of certain items, while making others optional, and link one structure with another. In other words, if you choose to include Item ABC, you can force an instance of Item XYZ to go with it. Alternatively, if ABC is present, then XYZ can be excluded.

Introducing Layout Flexibility with XML

XML makes it possible to truly vary the presentation of documents according to their intended use. Instead of compelling you to decide whether a data set is better presented as a list or as a table, you can present it in different ways for different purposes. On the printed page, a table is useful and hyperlinks are worthless, but in an audio browser for the blind, a hyperlink-navigable set of lists might be more accessible. Entering the same data into a database might require normalization and other transformations that would be superfluous elsewhere. A single XML source can support all these uses.

Achieving Platform Independence with XML

XML is completely platform-independent and extremely robust. No other data transport or distributed processing mechanism can make this claim. XML is text-based. You can look at the raw data and make perfect sense of it. Because XML describes a simple flat-file database, every application using any sort of database access can use XML as a lowest common denominator for transport, generating and translating from XML for transfer while using normalized or proprietary formats internally. Almost all database applications already have the capability to create a comma-delimited flat-file equivalent of database records, so expanding the commas into XML tags is almost trivial.

Designing Object-Oriented Documents with XML

Although the long-term success of object-oriented techniques is still debatable, XML is developing in ways that will eventually support object-oriented programming and design methods as an option. Although the current standard is slightly improved over the standard block sequential model, initiatives such as SOX, the Schema for Object-oriented XML, or equivalents will provide fully object-oriented access to database-like and other elements of XML structures.

Redefining the Possibilities of Web Development

HTML currently operates as an elaborate and sophisticated virtual fax machine originally designed to allow scholars to share copies of academic papers with each other on request. HTML layout tags mirror the physical structure of such a paper, primarily a simple outline with associated text and graphics. One labels headers, paragraphs, lists, tables, and illustrations. The default layout is linear, with text proceeding in an orderly fashion down the page.
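To make the earlier remark about comma-delimited exports concrete, here is a minimal sketch of expanding a comma-delimited flat file into XML tags. It is an illustration under stated assumptions, not a method from the book: Python, the standard `csv` and `xml.etree.ElementTree` modules, the helper name `csv_to_xml`, and the `Item` wrapper element are all introduced here for the example, and the column headings are assumed to be valid XML element names.

```python
import csv
import io
import xml.etree.ElementTree as ET

# A comma-delimited export with a header row (sample data for illustration).
CSV_TEXT = "Fruit,Price\nApples,$1\nOranges,$2\n"


def csv_to_xml(csv_text, root_tag="FruitPriceList", row_tag="Item"):
    """Turn each CSV record into a row element with one child per column.

    Hypothetical helper: root_tag and row_tag are names chosen for this
    sketch, and every column heading must be a legal XML element name.
    """
    root = ET.Element(root_tag)
    for record in csv.DictReader(io.StringIO(csv_text)):
        row = ET.SubElement(root, row_tag)
        for column, value in record.items():
            ET.SubElement(row, column).text = value
    return ET.tostring(root, encoding="unicode")


xml_text = csv_to_xml(CSV_TEXT)
print(xml_text)
```

Wrapping each record in its own element is a design choice made for this sketch: it keeps every fruit attached to its price without relying on the order of sibling elements.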
The few meaningful tags, <CITE>, <ADDRESS>, and so on, are designed to highlight such things as article and book citations, a scholarly concern, and the address of the author. Everything else that goes into making up a research paper, the 3×5 cards, the calculations and categorizations, the deep understanding of the subject matter structure, is just dumped in the trash on the way to the hyperlinked faux paper.

It’s not much to work with really. Everything else, the entire structure of the current Web, has been piled on top of that early design. It’s a tribute to the original designers that everything still works as well as it does.

However, the difference between text and data is that data has structure and context. Human beings can recognize or extrapolate structure and context from visual cues, so for purely human interaction HTML is sufficient. But computers don’t think that way. Or think at all for that matter.

Data Granularity and Structure

XML is designed to allow every meaningful division of a document to be unambiguously identified as part of a coherent tree structure that either a human or a computer can use. So an entire car can be described as a complete parts list, with everything from the engine to nuts and bolts broken down into lists of components. Or a book can be described as a collection of the chapters, paragraphs, footnotes, illustrations, and so on that make it up.

It’s a stunning concept, although it can’t yet capture data structures that are not tree-like. Many abstractions cannot be represented as trees. The meaning of “honor” may be clear to a Marine but you can’t disassemble it into component parts like a car, although there are clearly related concepts that inform the concept of honor. So “honor” is undoubtedly related to “fidelity” in some way, but the exact relationship disappears into the sort of vague cloud that network designers are fond of and philosophers can extrapolate into lengthy books.
Likewise, the Web itself is not a tree, but an enormously complex network of independent nodes linked in a directed graph with no particular starting point or root. (Figures 2.1 and 2.2 in the next chapter may make this clearer.) A tree has a single root and every branch is separate.

The database-like applications of XML, such as describing the component parts of a car, are usually more straightforward than describing complex objects such as a book. It may seem odd to think that a book is more complicated than a car, but it’s true. The problem is one of exclusion, making sure that an
